Abstract

We prove a moderate deviation principle for the continuous time
interpolation of discrete time recursive stochastic processes. The methods
of proof are somewhat different from those used for the corresponding large
deviation result, and in particular the proof of the upper bound is more
complicated.


1  Introduction 

In this paper we consider $\mathbb{R}^d$-valued discrete time processes of the form

\[ X^n_{i+1} = X^n_i + \frac{1}{n} b(X^n_i) + \frac{1}{n} v_i(X^n_i), \quad X^n_0 = x_0, \]

where $\{v_i(\cdot)\}_{i \in \mathbb{N}_0}$ are zero mean independent and identically
distributed (iid) random vector fields, and focus on their continuous time piecewise
linear interpolations $\{X^n(t)\}_{0 \le t \le T}$ with $X^n(i/n) = X^n_i$ (see (2.5) for
the precise definition). Under certain conditions there is a law of large numbers limit
$X^0 \in C([0,T] : \mathbb{R}^d)$, and the large deviations of $X^n$ from this limit have
been studied extensively (see, e.g., [1, 10, 12, 15, 17]). Here we introduce a
scaling $a(n)$ satisfying $a(n) \to 0$ and $a(n)\sqrt{n} \to \infty$, and study the
amplified difference between $X^n$ and its noiseless version $X^{n,0}$ (see Section 2
for the definition of $X^{n,0}$):

\[ Y^n = a(n)\sqrt{n}\,(X^n - X^{n,0}). \]

Under Condition 2.1 introduced below $\sup_{t \in [0,T]} \|X^0(t) - X^{n,0}(t)\| = O(1/n)$,
and hence $Y^n$ will behave the same asymptotically as $a(n)\sqrt{n}\,(X^n - X^0)$.
We demonstrate, under weaker conditions on the noise $v_i(\cdot)$ than are necessary
when considering $X^n$, that $Y^n$ satisfies the large deviation principle on
$C([0,T] : \mathbb{R}^d)$ with a “Gaussian” type rate function. As is customary for
this type of scaling, we refer to this as moderate deviations.

To demonstrate this result we prove the equivalent Laplace principle,
which involves evaluating limits of quantities of the form

\[ a(n)^2 \log E\left[ e^{-\frac{1}{a(n)^2} F(Y^n)} \right] \]

when $F$ is bounded and continuous. This is done by representing each
of these quantities in terms of a stochastic control problem, and then using
weak convergence methods as in [12]. Key results needed in this approach are
establishing tightness of controls and controlled processes, and identifying
their limits.

While one might expect the proof of this moderate deviations result to be
similar to the corresponding large deviations result, there are important
differences. For example, the tightness proof is significantly more complicated
in the case of moderate deviations than it is in the case of large deviations.
For large deviations one is able to establish an a priori bound on certain
relative entropy costs associated with any sequence of nearly minimizing
controls, and under this boundedness of the relative entropy costs, the
empirical measures of the controlled driving noises as well as the controlled
processes are tight. However, owing to the scaling in moderate deviations,
even with the information that the analogous relative entropy costs decay
like $O(1/(a(n)^2 n))$, tightness of the empirical measures of the noises does not
hold. Instead, one must consider empirical measures of the conditional
means of the noises, and additional effort is required for the law of large
numbers type result that shows that the conditional means are adequate to
determine the limit. This extra difficulty arises for moderate deviations
(even with the vanishing relative entropy costs), because the noise itself is
being amplified by $a(n)\sqrt{n}$.

A second way in which the proofs for large and moderate deviations
differ is in their treatment of degenerate noise, i.e., problems where the
support of $v_i(\cdot)$ is not all of $\mathbb{R}^d$. For large deviations, degenerate
noise leads to significant difficulties in the proof of the lower bound, and
requires a delicate and involved mollification argument. In contrast, the proof
in the setting of moderate deviations, though more involved than the
nondegenerate case, is much more straightforward.

As  a  potential  application  of  these  results  we  mention  their  usefulness 
in  the  design  and  analysis  of  Monte  Carlo  schemes.  It  is  well  known  that 
accelerated  Monte  Carlo  schemes  (e.g.,  importance  sampling  and  splitting) 
benefit  by  using  information  contained  in  the  large  deviation  rate  function 
as  part  of  the  algorithm  design  (e.g.,  [3,  8,  13,  14]).  In  a  situation  where 
one  considers  events  of  small  but  not  too  small  probability  one  may  find 
the  moderate  deviation  approximation  both  adequate  and  relatively  easy 
to  apply,  since  moderate  deviations  lead  to  situations  where  the  objects 
needed  to  design  an  efficient  scheme  can  be  explicitly  constructed  in  terms 
of  solutions  to  the  linear-quadratic  regulator.  These  issues  will  be  explored 
elsewhere. 

The existing literature on moderate deviations considers various settings.
Baldi [2] considers the same scaling used here but with no state dependence.
For the empirical measure of a Markov chain, de Acosta [7] and de
Acosta and Chen [6] prove lower and upper bounds, respectively. Guillin
[18] considers inhomogeneous functionals of a “fast” continuous time ergodic
Markov chain, and in [19] this is extended to a small noise diffusion whose
coefficients depend on the “fast” Markov chain. There are also results for
martingale differences such as Dembo [9], Gao [16], and Djellout [11]. For
various reasons, the issues previously mentioned regarding the difficulties in
the proof of the upper bound and the simplification in the lower bound for
degenerate noise do not play a role in these papers. For instance, proving
tightness in a moderate deviations setting for continuous time processes is
typically much easier. This is because measures on path space that have
bounded relative entropy with respect to Wiener measure have significantly
less variability than those with bounded relative entropy with respect to
a discrete time process. In particular, bounded relative entropy automatically
restricts to what one could consider to be “exponential tilts” of the
original distribution in continuous time, which does not happen in discrete
time, and is the reason more effort must be put into the proof of tightness.
This is illustrated by the convenient alternative formulations of the relative
entropy representation for some continuous time processes (see [4] for
Brownian motion and [5] for Poisson random measures).

The paper is organized as follows. Section 2 gives the statement of the
problem and notation. Section 3 contains the proof of tightness and the
characterization of limits, which account for most of the mathematical
difficulties, and are also the main results needed to prove the Laplace principle.
Sections 4 and 5 give the proofs of the upper and lower Laplace bounds.
Although all proofs are given for the time interval $[0,1]$, they extend with
only notational differences to $[0,T]$ for any $T \in (0,\infty)$.

Acknowledgement. The authors thank the referees for several suggestions
that improved the paper.

2  Background  and  Notation 

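Let

\[ X^n_{i+1} = X^n_i + \frac{1}{n} b(X^n_i) + \frac{1}{n} v_i(X^n_i), \quad X^n_0 = x_0, \]

where the $\{v_i(\cdot)\}_{i \in \mathbb{N}_0}$ are zero mean iid vector fields with
distribution given by the stochastic kernel $\mu_x$. Thus if $\mathcal{B}(\mathbb{R}^d)$ is
the Borel $\sigma$-algebra on $\mathbb{R}^d$, then $x \mapsto \mu_x(B)$ is measurable for all
$B \in \mathcal{B}(\mathbb{R}^d)$, $\mu_x(\cdot)$ is a probability measure on
$\mathcal{B}(\mathbb{R}^d)$ for all $x \in \mathbb{R}^d$, and $P(v_i(x) \in B) = \mu_x(B)$
for all $x \in \mathbb{R}^d$, $B \in \mathcal{B}(\mathbb{R}^d)$ and $i \in \mathbb{N}_0$. Define

\[ H_c(x,\alpha) = \log\left( \int_{\mathbb{R}^d} e^{\langle y, \alpha \rangle} \mu_x(dy) \right) \]

for $\alpha \in \mathbb{R}^d$. The subscript $c$ reflects the fact that this log moment
generating function uses the centered distribution $\mu_x$, rather than the usual
$H(x,\alpha) = H_c(x,\alpha) + \langle \alpha, b(x) \rangle$. We will use the following.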

Condition 2.1 • There exists $\lambda > 0$ and $K_{mgf} < \infty$ such that

\[ \sup_{x \in \mathbb{R}^d} \sup_{\|\alpha\| \le \lambda} H_c(x,\alpha) \le K_{mgf}. \tag{2.1} \]

• $x \mapsto \mu_x(dy)$ is continuous with respect to the topology of weak convergence.

• $b(x)$ is continuously differentiable, and the norms of both $b(x)$ and its
derivative are uniformly bounded by some constant $K_b < \infty$.

Throughout this paper we let $\|\alpha\|_A^2 = \langle \alpha, A\alpha \rangle$ for any
$\alpha \in \mathbb{R}^d$ and symmetric, nonnegative definite matrix $A$. Define

\[ A_{ij}(x) = \int_{\mathbb{R}^d} y_i y_j \,\mu_x(dy), \]

and note that the weak continuity of $\mu_x$ with respect to $x$ and (2.1) ensure
that $A(x)$ is continuous in $x$ and its norm is uniformly bounded by some
constant $K_A$. Note that

\[ \frac{\partial H_c(x,0)}{\partial \alpha_i} = \int_{\mathbb{R}^d} y_i \,\mu_x(dy) = 0 \]

and

\[ \frac{\partial^2 H_c(x,0)}{\partial \alpha_i \partial \alpha_j} = \int_{\mathbb{R}^d} y_i y_j \,\mu_x(dy) = A_{ij}(x) \]

for all $i,j \in \{1,\ldots,d\}$ and $x \in \mathbb{R}^d$, and that $A(x)$ is
nonnegative-definite and symmetric. For $x \in \mathbb{R}^d$ we can therefore write

\[ A(x) = Q(x)\Lambda(x)Q^T(x), \]

where $Q(x)$ is an orthogonal matrix whose columns are the eigenvectors of
$A(x)$ and $\Lambda(x)$ is the diagonal matrix consisting of the eigenvalues of $A(x)$
in descending order. In what follows we define $\Lambda^{-1}(x)$ to be the diagonal
matrix with diagonal entries equal to the inverse of the corresponding eigenvalue
for the positive eigenvalues, and equal to $\infty$ for the zero eigenvalues.
Then when we write

\[ \|\alpha\|^2_{A^{-1}(x)} = \|\alpha\|^2_{Q(x)\Lambda^{-1}(x)Q^T(x)}, \tag{2.2} \]

we mean a value of $\infty$ for $\alpha \in \mathbb{R}^d$ not in the linear span of the
eigenvectors corresponding to the positive eigenvalues, and the standard value for
vectors $\alpha \in \mathbb{R}^d$ in that linear span. Assumption (2.1) implies there
exists some $K_{DA} < \infty$ and $\lambda_{DA} \in (0,\lambda]$ (independent of $x$) such that

\[ \sup_{x \in \mathbb{R}^d} \sup_{\|\alpha\| \le \lambda_{DA}} \max_{i,j,k} \left| \frac{\partial^3 H_c(x,\alpha)}{\partial \alpha_i \partial \alpha_j \partial \alpha_k} \right| \le \frac{K_{DA}}{d^3}, \tag{2.3} \]

and consequently for all $\|\alpha\| \le \lambda_{DA}$ and all $x \in \mathbb{R}^d$

\[ \frac{1}{2}\|\alpha\|^2_{A(x)} - \|\alpha\|^3 K_{DA} \le H_c(x,\alpha) \le \frac{1}{2}\|\alpha\|^2_{A(x)} + \|\alpha\|^3 K_{DA}. \tag{2.4} \]

Define the continuous time linear interpolation of $\{X^n_i\}$ by $X^n(i/n) = X^n_i$
for $i = 0,\ldots,n$ and

\[ X^n(t) = (i + 1 - nt) X^n_i + (nt - i) X^n_{i+1} \tag{2.5} \]

for $t \in (i/n, i/n + 1/n)$. In addition, define

\[ X^{n,0}_{i+1} = X^{n,0}_i + \frac{1}{n} b(X^{n,0}_i), \quad X^{n,0}_0 = x_0, \]

and let $X^{n,0}(t)$ be the analogous continuous time linear interpolation given
by $X^{n,0}(i/n) = X^{n,0}_i$ for $i = 0,\ldots,n$ and

\[ X^{n,0}(t) = (i + 1 - nt) X^{n,0}_i + (nt - i) X^{n,0}_{i+1} \]

for $t \in (i/n, i/n + 1/n)$. Clearly $X^{n,0}(t) \to X^0(t)$ in $C([0,1] : \mathbb{R}^d)$, where

\[ X^0(t) = \int_0^t b(X^0(s))\,ds + x_0. \]

Since $E v_i(x) = 0$ for all $x \in \mathbb{R}^d$, we know that $X^n(t) \to X^0(t)$ in
$C([0,1] : \mathbb{R}^d)$ in probability. One can estimate probabilities for events involving
paths outside the law of large numbers limit $X^0$ by proving a large deviation
principle and finding the corresponding rate function.


Definition 2.2 Let $\{Z^n, n \in \mathbb{N}\}$ be a sequence of random variables defined
on a probability space $(\Omega, \mathcal{F}, P)$ and taking values in a Polish space
$\mathcal{Z}$. A function $I : \mathcal{Z} \to [0,\infty]$ is called a rate function if for any
$M < \infty$ the set $\{z : I(z) \le M\}$ is compact in $\mathcal{Z}$. The sequence $\{Z^n\}$
satisfies the large deviation principle on $\mathcal{Z}$ with rate function $I$ and
sequence $r(n)$ if the following two conditions hold.

• Large Deviation Upper Bound: for each closed subset $F$ of $\mathcal{Z}$

\[ \limsup_{n \to \infty} r(n) \log P(Z^n \in F) \le -\inf_{z \in F} I(z). \]

• Large Deviation Lower Bound: for each open subset $G$ of $\mathcal{Z}$

\[ \liminf_{n \to \infty} r(n) \log P(Z^n \in G) \ge -\inf_{z \in G} I(z). \]

Under significantly stronger assumptions, including the assumption that

\[ \sup_{x \in \mathbb{R}^d} H_c(x,\alpha) < \infty \]

for all $\alpha \in \mathbb{R}^d$, it has been shown that $X^n(t)$ satisfies the large
deviation principle on $C([0,1] : \mathbb{R}^d)$ with sequence $r(n) = 1/n$ and rate function

\[ I_L(\phi) = \inf\left\{ \int_0^1 L_c(\phi(s), u(s))\,ds : \phi(t) = x_0 + \int_0^t b(\phi(s))\,ds + \int_0^t u(s)\,ds,\ t \in [0,1] \right\}, \]

where

\[ L_c(x,\beta) = \sup_{\alpha \in \mathbb{R}^d} \{ \langle \alpha, \beta \rangle - H_c(x,\alpha) \} \]

is the Legendre transform of $H_c(x,\alpha)$ [12, 21, 22, 23, 24].

2nd August 2014

Assume $a(n)$ satisfies

\[ a(n) \to 0 \quad \text{and} \quad a(n)\sqrt{n} \to \infty. \tag{2.6} \]

We define the rescaled difference

\[ Y^n(t) = a(n)\sqrt{n}\,(X^n(t) - X^{n,0}(t)). \]

As noted in the introduction, the result stated below also holds with the
interval $[0,1]$ replaced by $[0,T]$, $T \in (0,\infty)$. Let $D$ denote the gradient
operator.

Theorem 2.3 Assume Condition 2.1. Then $\{Y^n\}_{n \in \mathbb{N}}$ satisfies the large
deviation principle on $C([0,1] : \mathbb{R}^d)$ with sequence $a(n)^2$ and rate function

\[ I_M(\phi) = \inf\left\{ \frac{1}{2}\int_0^1 \|u(t)\|^2\,dt : \phi(t) = \int_0^t Db(X^0(s))\phi(s)\,ds + \int_0^t A^{1/2}(X^0(s))u(s)\,ds,\ t \in [0,1] \right\}. \]
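To make the objects in Theorem 2.3 concrete, the following Python sketch simulates the recursion for $X^n_i$, its noiseless version $X^{n,0}_i$, and the rescaled difference $Y^n$ at the grid points $i/n$. It is only an illustration under assumptions that are not part of the paper: $d = 1$, $b(x) = \sin x$ (bounded with bounded derivative, as Condition 2.1 requires), state-independent standard Gaussian noise, and $a(n) = n^{-1/4}$, which satisfies (2.6). The function name `simulate_y` and its parameters are hypothetical.

```python
import math
import random


def simulate_y(n, x0=0.5, noise_scale=1.0, seed=0):
    """Simulate X^n_i, X^{n,0}_i and Y^n(i/n) for i = 0, ..., n in d = 1.

    Illustrative assumptions (not from the paper): b(x) = sin(x),
    v_i = noise_scale * N(0, 1) independent of the state, and
    a(n) = n**(-1/4), so that a(n) -> 0 and a(n)*sqrt(n) -> infinity.
    """
    rng = random.Random(seed)
    a_n = n ** -0.25
    x = [x0]      # noisy recursion X^n_i
    x_det = [x0]  # noiseless recursion X^{n,0}_i
    for _ in range(n):
        v = noise_scale * rng.gauss(0.0, 1.0)
        x.append(x[-1] + math.sin(x[-1]) / n + v / n)
        x_det.append(x_det[-1] + math.sin(x_det[-1]) / n)
    scale = a_n * math.sqrt(n)  # amplification a(n) * sqrt(n)
    y = [scale * (xi - xd) for xi, xd in zip(x, x_det)]
    return x, x_det, y


if __name__ == "__main__":
    for n in (100, 10_000):
        _, _, y = simulate_y(n)
        print(n, max(abs(value) for value in y))
```

In this sketch the difference $X^n - X^{n,0}$ is typically of order $1/\sqrt{n}$, so typical values of $Y^n$ are of order $a(n)$ and shrink as $n$ grows; the moderate deviation principle concerns the rare paths on which $Y^n$ remains of order one.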


$I_M$ is essentially the same as what one would obtain by using a linear
approximation around the law of large numbers limit $X^0$ of the dynamics
and a quadratic approximation of the costs in $I_L$. To prove the LDP, it
suffices to show the Laplace principle [12, Theorem 1.2.3]

\[ \lim_{n \to \infty} -a(n)^2 \log E\left[ e^{-\frac{1}{a(n)^2} F(Y^n)} \right] = \inf_{u \in L^2([0,1] : \mathbb{R}^d)} \left\{ \frac{1}{2}\int_0^1 \|u(s)\|^2\,ds + F\left(\phi^{A^{1/2}(X^0)u}\right) \right\}, \tag{2.7} \]

where

\[ \phi^u(t) = \int_0^t Db(X^0(s))\phi^u(s)\,ds + \int_0^t u(s)\,ds. \tag{2.8} \]

Note that

\[ Y^n_{i+1} = Y^n_i + \frac{a(n)}{\sqrt{n}}\left( b(X^n_i) - b(X^{n,0}_i) \right) + \frac{a(n)}{\sqrt{n}} v_i(X^n_i), \quad Y^n_0 = 0. \]


For $\eta, \mu \in \mathcal{P}(\mathbb{R}^d)$ [the set of probability measures on
$\mathcal{B}(\mathbb{R}^d)$], the relative entropy of $\eta$ with respect to $\mu$ is defined by

\[ R(\eta \,\|\, \mu) = \int_{\mathbb{R}^d} \log\left( \frac{d\eta}{d\mu}(x) \right) \eta(dx) \in [0,\infty] \]

if $\eta$ is absolutely continuous with respect to $\mu$, and $R(\eta \,\|\, \mu) = \infty$ otherwise.
For general properties of relative entropy we refer to [12, Section 1.4]. The
variational formula [12, Proposition 1.4.2(a)] and chain rule [12, Theorem
C.3.1] imply that


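\[ -a(n)^2 \log E\left[ e^{-\frac{1}{a(n)^2} F(Y^n)} \right] = \inf_{\eta} E\left[ \sum_{i=0}^{n-1} a(n)^2 R(\eta_i \,\|\, \mu_{\bar{X}^n_i}) + F(\bar{Y}^n) \right] \tag{2.9} \]

for any bounded, continuous $F : C([0,1] : \mathbb{R}^d) \to \mathbb{R}$. Here
$\eta \in \mathcal{P}((\mathbb{R}^d)^n)$ is the joint distribution of
$(\bar{v}_0, \ldots, \bar{v}_{n-1})$, $\eta_i(\cdot)$ is the conditional distribution
on $\bar{v}_i$ given $(\bar{v}_0, \ldots, \bar{v}_{i-1})$,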


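\[ \bar{X}^n_{i+1} = \bar{X}^n_i + \frac{1}{n} b(\bar{X}^n_i) + \frac{1}{n} \bar{v}_i, \quad \bar{X}^n_0 = x_0, \tag{2.10} \]

\[ \bar{Y}^n_{i+1} = \bar{Y}^n_i + \frac{a(n)}{\sqrt{n}}\left( b(\bar{X}^n_i) - b(X^{n,0}_i) \right) + \frac{a(n)}{\sqrt{n}} \bar{v}_i, \quad \bar{Y}^n_0 = 0 \tag{2.11} \]

and, similar to (2.5), $\bar{X}^n(t)$ and $\bar{Y}^n(t)$ are the continuous time linear
interpolations of $\{\bar{X}^n_i\}_{i=0,\ldots,n}$ and $\{\bar{Y}^n_i\}_{i=0,\ldots,n}$. Note
that $\eta_i$ depends on past values of the noise, but we suppress this dependence in the
notation. We will prove (2.7) by proving the lower bound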


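\[ \liminf_{n \to \infty} -a(n)^2 \log E\left[ e^{-\frac{1}{a(n)^2} F(Y^n)} \right] \ge \inf_{u \in L^2([0,1] : \mathbb{R}^d)} \left\{ \frac{1}{2}\int_0^1 \|u(s)\|^2\,ds + F\left(\phi^{A^{1/2}(X^0)u}\right) \right\} \tag{2.12} \]

and the upper bound

\[ \limsup_{n \to \infty} -a(n)^2 \log E\left[ e^{-\frac{1}{a(n)^2} F(Y^n)} \right] \le \inf_{u \in L^2([0,1] : \mathbb{R}^d)} \left\{ \frac{1}{2}\int_0^1 \|u(s)\|^2\,ds + F\left(\phi^{A^{1/2}(X^0)u}\right) \right\}. \tag{2.13} \]

We will use a tightness and weak convergence result in the proofs of both of
these bounds, but first establish notation used in the rest of the paper.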


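Construction 2.4 Given a sequence of measures $\{\eta^n\}_{n \in \mathbb{N}}$ with each
$\eta^n \in \mathcal{P}((\mathbb{R}^d)^n)$, define the following. Let
$(\bar{v}^n_0, \ldots, \bar{v}^n_{n-1})$ be random variables with distribution $\eta^n$, and
define $\{\bar{X}^n_i\}_{i=0,\ldots,n}$ and $\{\bar{Y}^n_i\}_{i=0,\ldots,n}$ by (2.10) and
(2.11). Let

\[ \bar{X}^n(t) = (i + 1 - nt)\bar{X}^n_i + (nt - i)\bar{X}^n_{i+1} \]

and

\[ \bar{Y}^n(t) = (i + 1 - nt)\bar{Y}^n_i + (nt - i)\bar{Y}^n_{i+1} \]

for $t \in [i/n, i/n + 1/n]$, $i = 0, \ldots, n-1$, be their continuous time linear
interpolations. Define the conditional means of the noises

\[ w^n(t) = \int_{\mathbb{R}^d} y\,\eta^n_i(dy) \quad \text{for } t \in \left[ \frac{i}{n}, \frac{i+1}{n} \right), \]

the amplified conditional means

\[ \hat{w}^n(t) = a(n)\sqrt{n}\,w^n(t), \]

and random measures on $\mathbb{R}^d \otimes [0,1]$ by

\[ \hat{\eta}^n(dy \otimes dt) = \delta_{\hat{w}^n(t)}(dy)\,dt = \delta_{a(n)\sqrt{n}\,w^n(t)}(dy)\,dt. \]

We will refer to this construction when given $\eta^n$ to identify the associated
$\bar{X}^n$, $\bar{Y}^n$, $\hat{w}^n$ and $\hat{\eta}^n$. Given $\nu \in \mathcal{P}(E_1 \times E_2)$,
with each $E_i$, $i = 1,2$, a Polish space, let $\nu_2$ denote the second marginal of $\nu$,
and let $\nu_{1|2}$ denote the conditional distribution on $E_1$ given a point in $E_2$.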


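Theorem 2.5 Let $\{\eta^n\}$ be a sequence of measures, each
$\eta^n \in \mathcal{P}((\mathbb{R}^d)^n)$, and define the corresponding random variables as in
Construction 2.4. Assume that for some $K_E < \infty$

\[ \sup_{n \in \mathbb{N}} a(n)^2 n\, E\left[ \frac{1}{n}\sum_{i=0}^{n-1} R(\eta^n_i \,\|\, \mu_{\bar{X}^n_i}) \right] \le K_E. \tag{2.14} \]

Then $\{(\hat{\eta}^n, \bar{Y}^n)\}_{n \in \mathbb{N}}$ is tight in
$\mathcal{P}(\mathbb{R}^d \otimes [0,1]) \otimes C([0,1] : \mathbb{R}^d)$. Consider a
subsequence (keeping the index $n$ for convenience) such that $\{(\hat{\eta}^n, \bar{Y}^n)\}$
converges weakly to $(\hat{\eta}, \bar{Y})$. Then with probability 1 $\hat{\eta}_2(dt)$ is
Lebesgue measure and

\[ \bar{Y}(t) = \int_0^t Db(X^0(s))\bar{Y}(s)\,ds + \int_0^t \hat{w}(s)\,ds, \tag{2.15} \]

where

\[ \hat{w}(t) = \int_{\mathbb{R}^d} y\,\hat{\eta}_{1|2}(dy \,|\, t). \]

In addition,

\[ \liminf_{n \to \infty} a(n)^2 n\, E\left[ \frac{1}{n}\sum_{i=0}^{n-1} R(\eta^n_i \,\|\, \mu_{\bar{X}^n_i}) \right] \ge \frac{1}{2} E\left[ \int_0^1 \|\hat{w}(s)\|^2_{A^{-1}(X^0(s))}\,ds \right]. \tag{2.16} \]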


3  Proof  of  Theorem  2.5 

Assume that the bound (2.14) holds. We will show tightness of the $\{\hat{\eta}^n\}$
measures using the following lemma.


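Lemma 3.1 Assume Condition 2.1 and let

\[ L_c(x,\beta) = \sup_{\alpha \in \mathbb{R}^d} \{ \langle \alpha, \beta \rangle - H_c(x,\alpha) \} \tag{3.1} \]

be the Legendre transform of $H_c(x,\cdot)$. Then for any $x \in \mathbb{R}^d$ and
$\eta \in \mathcal{P}(\mathbb{R}^d)$

\[ R(\eta \,\|\, \mu_x) \ge L_c\left( x, \int_{\mathbb{R}^d} y\,\eta(dy) \right). \]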

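Proof. While the result is likely known we could not locate a proof (see
[12, Lemma 6.2.3(f)] for a proof when $H_c(x,\alpha)$ is finite for all
$\alpha \in \mathbb{R}^d$), and so for completeness provide the details. If
$R(\eta \,\|\, \mu_x) = \infty$ the lemma is automatically true, so we assume
$R(\eta \,\|\, \mu_x) < \infty$. Define $\ell(b) = b\log b - b + 1$ and note that for $a, b \ge 0$

\[ ab \le e^a + \ell(b). \tag{3.2} \]

From (2.1) we have

\[ \int_{\mathbb{R}^d} e^{\frac{\lambda}{2^d}\|y\|}\,\mu_x(dy) \le 2d\,e^{dK_{mgf}} < \infty. \]

Therefore

\[ \int_{\mathbb{R}^d} \frac{\lambda}{2^d}\|y\|\,\eta(dy) = \int_{\mathbb{R}^d} \frac{\lambda}{2^d}\|y\|\,\frac{d\eta}{d\mu_x}(y)\,\mu_x(dy) \le \int_{\mathbb{R}^d} e^{\frac{\lambda}{2^d}\|y\|}\,\mu_x(dy) + \int_{\mathbb{R}^d} \ell\left( \frac{d\eta}{d\mu_x}(y) \right)\mu_x(dy) \le 2d\,e^{dK_{mgf}} + R(\eta \,\|\, \mu_x), \]

and consequently for any $\alpha \in \mathbb{R}^d$

\[ \int_{\mathbb{R}^d} \|\alpha\|\,\|y\|\,\eta(dy) \le \frac{2^d\|\alpha\|}{\lambda}\left( 2d\,e^{dK_{mgf}} + R(\eta \,\|\, \mu_x) \right) < \infty. \tag{3.3} \]

Define the bounded, continuous function

\[ F_K(y,\alpha) = \begin{cases} \langle \alpha, y \rangle & \text{if } |\langle \alpha, y \rangle| \le K, \\ K\,\dfrac{\langle \alpha, y \rangle}{|\langle \alpha, y \rangle|} & \text{otherwise}, \end{cases} \]

and note that (3.3) and dominated convergence give

\[ \lim_{K \to \infty} \int_{\mathbb{R}^d} F_K(y,\alpha)\,\eta(dy) = \left\langle \alpha, \int_{\mathbb{R}^d} y\,\eta(dy) \right\rangle. \]

In addition, dominated convergence gives

\[ \lim_{K \to \infty} \int_{\{y : \langle \alpha, y \rangle < 0\}} e^{F_K(y,\alpha)}\,\mu_x(dy) = \int_{\{y : \langle \alpha, y \rangle < 0\}} e^{\langle \alpha, y \rangle}\,\mu_x(dy) \]

and monotone convergence gives

\[ \lim_{K \to \infty} \int_{\{y : \langle \alpha, y \rangle \ge 0\}} e^{F_K(y,\alpha)}\,\mu_x(dy) = \int_{\{y : \langle \alpha, y \rangle \ge 0\}} e^{\langle \alpha, y \rangle}\,\mu_x(dy), \]

and hence

\[ \lim_{K \to \infty} \log\left( \int_{\mathbb{R}^d} e^{F_K(y,\alpha)}\,\mu_x(dy) \right) = H_c(x,\alpha). \]

By the Donsker-Varadhan variational formula [12, Lemma 1.4.3(a)]

\[ R(\eta \,\|\, \mu_x) \ge \int_{\mathbb{R}^d} F_K(y,\alpha)\,\eta(dy) - \log\left( \int_{\mathbb{R}^d} e^{F_K(y,\alpha)}\,\mu_x(dy) \right) \]

for all $K < \infty$ and $\alpha \in \mathbb{R}^d$, and so

\[ R(\eta \,\|\, \mu_x) \ge \sup_{\alpha \in \mathbb{R}^d} \left\{ \left\langle \alpha, \int_{\mathbb{R}^d} y\,\eta(dy) \right\rangle - H_c(x,\alpha) \right\} = L_c\left( x, \int_{\mathbb{R}^d} y\,\eta(dy) \right), \]

which completes the proof of the lemma. ■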
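The inequality of Lemma 3.1 can be checked numerically in a simple case. The sketch below is an illustration only and assumes a hypothetical centered reference measure on the three points $\{-1, 0, 1\}$ in $d = 1$; the names `h_c`, `l_c`, `relative_entropy`, `mean` and `tilt` are introduced here and do not come from the paper. The Legendre transform is approximated from below by a grid search, so the computed value can only underestimate $L_c$, and the comparison with the relative entropy stays valid.

```python
import math

# Hypothetical centered reference measure mu on three points (d = 1).
SUPPORT = (-1.0, 0.0, 1.0)
MU = (0.25, 0.5, 0.25)


def h_c(alpha):
    """Log moment generating function H_c(alpha) of MU."""
    return math.log(sum(m * math.exp(alpha * y) for y, m in zip(SUPPORT, MU)))


def l_c(beta, lo=-10.0, hi=10.0, steps=20000):
    """Grid approximation (from below) of sup_alpha {alpha*beta - H_c(alpha)}."""
    best = -math.inf
    for k in range(steps + 1):
        alpha = lo + (hi - lo) * k / steps
        best = max(best, alpha * beta - h_c(alpha))
    return best


def relative_entropy(eta):
    """R(eta || MU) for a probability vector eta on SUPPORT."""
    return sum(e * math.log(e / m) for e, m in zip(eta, MU) if e > 0.0)


def mean(eta):
    """Mean of eta, i.e. the integral of y against eta."""
    return sum(e * y for e, y in zip(eta, SUPPORT))


def tilt(alpha):
    """Exponential tilt of MU with parameter alpha."""
    weights = [m * math.exp(alpha * y) for y, m in zip(SUPPORT, MU)]
    total = sum(weights)
    return tuple(w / total for w in weights)


if __name__ == "__main__":
    for eta in [(0.1, 0.2, 0.7), (0.6, 0.3, 0.1), tilt(0.7)]:
        print(eta, relative_entropy(eta), l_c(mean(eta)))
```

For an exponential tilt of the reference measure the two printed quantities agree up to the grid error, which is the equality case of the lemma; for the other measures the relative entropy is strictly larger.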

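The lemma implies the following theorem, which in turn will give tightness
of $\{\hat{\eta}^n\}$.

Theorem 3.2 Assume Condition 2.1 and (2.14). For the processes $\{w^n\}$
obtained in Construction 2.4

\[ \sup_{n \in \mathbb{N}} E\left[ \int_0^1 a(n)\sqrt{n}\,\|w^n(s)\|\,ds \right] < \infty. \]

In addition, $\{a(n)\sqrt{n}\,w^n(\cdot)\}_{n \in \mathbb{N}}$ is uniformly integrable in the sense that

\[ \lim_{C \to \infty} \limsup_{n \to \infty} E\left[ \int_0^1 1_{\{a(n)\sqrt{n}\|w^n(s)\| > C\}}\,a(n)\sqrt{n}\,\|w^n(s)\|\,ds \right] = 0. \]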
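Proof. We use the following inequality. Let $G > 0$ satisfy
$\lambda_{DA} \min_{n \in \mathbb{N}}\{a(n)\sqrt{n}\} \ge \sqrt{G}$ [recall (2.6)] so that
$\lambda_{DA} \ge \sqrt{G}/(a(n)\sqrt{n})$ for all $n$. Define $L_c$ by (3.1).
Let $K = \lambda_{DA}K_{DA} + K_A/2$. Then with $e_i$ denoting the standard unit vectors

\[
\begin{aligned}
a(n)^2 n L_c(x,\beta)
&= \sup_{\alpha \in \mathbb{R}^d} \left[ a(n)\sqrt{n}\,\langle \alpha, a(n)\sqrt{n}\,\beta \rangle - a(n)^2 n H_c(x,\alpha) \right] \\
&\ge \pm a(n)\sqrt{n}\left\langle \frac{\sqrt{G}}{a(n)\sqrt{n}} e_i, a(n)\sqrt{n}\,\beta \right\rangle - a(n)^2 n H_c\left( x, \pm\frac{\sqrt{G}}{a(n)\sqrt{n}} e_i \right) \\
&\ge \pm\sqrt{G}\,a(n)\sqrt{n}\,\beta_i - \frac{1}{2}G\,\|A(x)\| - G\lambda_{DA}K_{DA} \\
&\ge \pm\sqrt{G}\,a(n)\sqrt{n}\,\beta_i - GK,
\end{aligned}
\]

where the first inequality follows from making a specific choice of $\alpha$ and the
second uses (2.4). Therefore

\[ d\,a(n)^2 n L_c(x,\beta) + dGK \ge \sqrt{G}\,a(n)\sqrt{n}\,\|\beta\|. \tag{3.4} \]

Using the bound on $L_c$ from Lemma 3.1 together with (2.14) and the last
display,

\[ \sup_{n \in \mathbb{N}} E\left[ \int_0^1 a(n)\sqrt{n}\,\|w^n(s)\|\,ds \right] \le \frac{d}{\sqrt{G}}\left( \sup_{n \in \mathbb{N}} a(n)^2 n\,E\left[ \frac{1}{n}\sum_{i=0}^{n-1} L_c\left( \bar{X}^n_i, w^n(i/n) \right) \right] + GK \right) \le \frac{d}{\sqrt{G}}\left( K_E + GK \right) < \infty. \tag{3.5} \]

For the uniform integrability, let $C \in (1,\infty)$ be arbitrary and consider
$n$ large enough that

\[ \min\{\lambda_{DA}, 1\} \ge \frac{\sqrt{C}}{a(n)\sqrt{n}}. \]

Since $\lambda_{DA} \ge 1/(a(n)\sqrt{n})$ the derivation leading to (3.5) holds for $G = 1$, and
therefore

\[ E\left[ \int_0^1 a(n)\sqrt{n}\,\|w^n(s)\|\,ds \right] \le K^* = d\left( K_E + \frac{1}{2}K_A + \lambda_{DA}K_{DA} \right), \]

which implies

\[ E\left[ \int_0^1 1_{\{a(n)\sqrt{n}\|w^n(s)\| > C\}}\,ds \right] \le \frac{K^*}{C}. \]

Since $\lambda_{DA} \ge \sqrt{C}/(a(n)\sqrt{n})$ the estimate (3.4) holds with $G$ replaced by $C$,
and then the last display and (3.5) give

\[
\begin{aligned}
\sqrt{C}\,E\left[ \int_0^1 1_{\{a(n)\sqrt{n}\|w^n(s)\| > C\}}\,a(n)\sqrt{n}\,\|w^n(s)\|\,ds \right]
&\le E\left[ d\int_0^1 1_{\{a(n)\sqrt{n}\|w^n(s)\| > C\}}\left( a(n)^2 n L_c\left( \bar{X}^n_{\lfloor ns \rfloor}, w^n(s) \right) + CK \right) ds \right] \\
&\le d\,a(n)^2 n\,E\left[ \int_0^1 L_c\left( \bar{X}^n_{\lfloor ns \rfloor}, w^n(s) \right) ds \right] + CdK\,E\left[ \int_0^1 1_{\{a(n)\sqrt{n}\|w^n(s)\| > C\}}\,ds \right] \\
&\le K^* d\,(1 + K).
\end{aligned}
\]

We conclude that

\[ \lim_{C \to \infty} \limsup_{n \to \infty} E\left[ \int_0^1 1_{\{a(n)\sqrt{n}\|w^n(s)\| > C\}}\,a(n)\sqrt{n}\,\|w^n(s)\|\,ds \right] = 0, \]

which is the claimed uniform integrability. ■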


We continue with the proof of Theorem 2.5. Note that $g(y,t) = \|y\|$ is a
tightness function on $\mathbb{R}^d \otimes [0,1]$, so by [12, Theorem A.3.17]

\[ G(\eta) = \int_{\mathbb{R}^d \otimes [0,1]} \|y\|\,\eta(dy \otimes dt) \]

is a tightness function on $\mathcal{P}(\mathbb{R}^d \otimes [0,1])$ and

\[ \hat{G}(\gamma) = \int_{\mathcal{P}(\mathbb{R}^d \otimes [0,1])} \int_{\mathbb{R}^d \otimes [0,1]} \|y\|\,\eta(dy \otimes dt)\,\gamma(d\eta) \]

is a tightness function on $\mathcal{P}(\mathcal{P}(\mathbb{R}^d \otimes [0,1]))$. Since

\[ \sup_{n \in \mathbb{N}} E\,G(\hat{\eta}^n) = \sup_{n \in \mathbb{N}} E\left[ \int_{\mathbb{R}^d \otimes [0,1]} \|y\|\,\hat{\eta}^n(dy \otimes dt) \right] = \sup_{n \in \mathbb{N}} E\left[ \int_0^1 a(n)\sqrt{n}\,\|w^n(s)\|\,ds \right] < \infty, \]

$\{\hat{\eta}^n\}$ is tight and consequently there is a subsequence of $\{\hat{\eta}^n\}$ which
converges weakly. To simplify notation we retain $n$ as the index of this convergent
subsequence, and denote the weak limit of $\{\hat{\eta}^n\}$ by $\hat{\eta}$. Note that for all
$n$ the second marginal of $\hat{\eta}^n(dy \otimes dt)$, which we denote by $\hat{\eta}^n_2(dt)$,
is Lebesgue measure, and therefore $\hat{\eta}_2(dt)$ is Lebesgue measure with probability 1.

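Our aim is to show that $\bar{Y}^n(t) \to \bar{Y}(t)$ weakly in $C([0,1] : \mathbb{R}^d)$, where
$\bar{Y}(t)$ is given by (2.15) in terms of the weak limit $\hat{\eta}$. To achieve this we
introduce the following processes which serve as intermediate steps. Let
$\check{Y}^n_0 = 0$ and

\[ \check{Y}^n_{i+1} = \check{Y}^n_i + \frac{a(n)}{\sqrt{n}}\left( b\left( X^{n,0}_i + \frac{1}{a(n)\sqrt{n}}\check{Y}^n_i \right) - b(X^{n,0}_i) \right) + \frac{a(n)}{\sqrt{n}}\,w^n(i/n), \]

together with its continuous time linear interpolation defined for
$t \in [i/n, i/n + 1/n]$ by

\[ \check{Y}^n(t) = (i + 1 - nt)\check{Y}^n_i + (nt - i)\check{Y}^n_{i+1}. \]

Also let

\[ \hat{Y}^n(t) = \int_0^t Db(X^0(s))\hat{Y}^n(s)\,ds + \int_0^t \hat{w}^n(s)\,ds, \tag{3.6} \]

where

\[ \hat{w}^n(t) = \int_{\mathbb{R}^d} y\,\hat{\eta}^n_{1|2}(dy \,|\, t) \]

as in Construction 2.4. These are both random variables taking values
in $C([0,1] : \mathbb{R}^d)$. Note that $\check{Y}^n$ differs from $\bar{Y}^n$ because $\bar{Y}^n$
is driven by the actual noises and $\check{Y}^n$ is driven by their conditional means. While the
driving terms of $\check{Y}^n$ and $\hat{Y}^n$ are the same [recall that
$a(n)\sqrt{n}\,w^n(t) = \hat{w}^n(t)$], they differ in that $\check{Y}^n$ is still a linear
interpolation of a discrete time process whereas $\hat{Y}^n$ satisfies an ODE. The goal is to
show that along the subsequence where $\hat{\eta}^n \to \hat{\eta}$ weakly

\[ \bar{Y}^n - \check{Y}^n \to 0, \quad \check{Y}^n - \hat{Y}^n \to 0, \quad \text{and} \quad \hat{Y}^n \to \bar{Y} \]

in $C([0,1] : \mathbb{R}^d)$, all in distribution. To show $\hat{Y}^n \to \bar{Y}$ we show that
$\{\hat{Y}^n\}$ is tight in $C([0,1] : \mathbb{R}^d)$ and use the mapping defined by (3.6) from
$\int_0^\cdot \hat{w}^n\,ds$ to $\hat{Y}^n$. Recall that $\sup_{x \in \mathbb{R}^d} \|Db(x)\| \le K_b$.
The following lemma is an easy consequence of Gronwall’s inequality.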
Lemma 3.3 Let $u \in L^1([0,1] : \mathbb{R}^d)$ be arbitrary and $\phi^u$ be defined as in
(2.8). Then for $0 \le s \le t \le 1$

\[ \|\phi^u(t) - \phi^u(s)\| \le (t - s)K_b e^{K_b} \int_0^1 \|u(r)\|\,dr + \int_s^t \|u(r)\|\,dr. \]

With this lemma and the uniform integrability of $\{\hat{\eta}^n\}$ given in Theorem
3.2, tightness follows.
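The bound of Lemma 3.3 can be illustrated numerically. The sketch below is a one-dimensional check under assumptions made only for this example: $Db(X^0(s))$ is replaced by a hypothetical scalar coefficient $c(s)$ with $|c(s)| \le K_b$, the equation (2.8) is discretized by the forward Euler scheme, and the integrals of $\|u\|$ are replaced by left-endpoint Riemann sums on the same grid. With these choices the discrete analogue of the bound holds exactly for the Euler iterates, by the same Gronwall argument. The function names are hypothetical.

```python
import math


def euler_phi(c, u, steps):
    """Forward Euler scheme for phi'(t) = c(t)*phi(t) + u(t), phi(0) = 0, on [0, 1]."""
    h = 1.0 / steps
    phi = [0.0]
    for k in range(steps):
        t = k * h
        phi.append(phi[-1] + h * (c(t) * phi[-1] + u(t)))
    return phi


def lemma_3_3_bound(u, k_b, steps, i, j):
    """Right side of Lemma 3.3 at s = i/steps, t = j/steps (i <= j), with the
    integrals of |u| replaced by left-endpoint Riemann sums."""
    h = 1.0 / steps
    total = h * sum(abs(u(k * h)) for k in range(steps))
    local = h * sum(abs(u(k * h)) for k in range(i, j))
    return (j - i) * h * k_b * math.exp(k_b) * total + local


if __name__ == "__main__":
    k_b = 2.0
    c = lambda t: k_b * math.cos(7.0 * t)        # |c(t)| <= K_b
    u = lambda t: 3.0 * math.sin(11.0 * t) - 1.0
    steps = 400
    phi = euler_phi(c, u, steps)
    i, j = 100, 150
    print(abs(phi[j] - phi[i]), lemma_3_3_bound(u, k_b, steps, i, j))
```

The first printed number is the increment of the discretized solution and the second is the corresponding bound; the increment never exceeds the bound, although the bound is usually far from sharp because of the factor $e^{K_b}$.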

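Lemma 3.4 Assume Condition 2.1 and (2.14). The sequence $\{\hat{Y}^n\}$ defined
in (3.6) in terms of the measures $\{\eta^n\}$ via Construction 2.4 is tight in
$C([0,1] : \mathbb{R}^d)$, as is $\{\int_0^\cdot \hat{w}^n\,ds\}$.

Proof. It suffices to show that for any $\varepsilon > 0$ there is $\delta > 0$ such that

\[ \limsup_{n \to \infty} P\left( \sup_{|s - t| \le \delta} \|\hat{Y}^n(t) - \hat{Y}^n(s)\| \ge \varepsilon \right) \le \varepsilon. \]

Since $\hat{\eta}^n$ is the integral of a point mass located at $\hat{w}^n(t)$,

\[ T(C) = \limsup_{n \to \infty} E\left[ \int_0^1 1_{\{\|\hat{w}^n(t)\| > C\}}\,\|\hat{w}^n(t)\|\,dt \right] = \limsup_{n \to \infty} E\left[ \int_{\{(y,t) : \|y\| > C\}} \|y\|\,\hat{\eta}^n(dy \otimes dt) \right]. \]

By Theorem 3.2 $T(C) \to 0$ as $C \to \infty$. Define also
$K_v = \sup_{n \in \mathbb{N}} E\int_0^1 \|\hat{w}^n(t)\|\,dt$,
which is finite by Theorem 3.2. Let $\varepsilon > 0$ be arbitrary. Then for any $s < t$
satisfying $t - s \le \delta$ Lemma 3.3 implies

\[ \|\hat{Y}^n(t) - \hat{Y}^n(s)\| \le \delta K_b e^{K_b} \int_0^1 \|\hat{w}^n(r)\|\,dr + \int_s^t \|\hat{w}^n(r)\|\,dr. \]

Since

\[ \int_s^t \|\hat{w}^n(r)\|\,dr \le C\delta + \int_0^1 1_{\{\|\hat{w}^n(r)\| > C\}}\,\|\hat{w}^n(r)\|\,dr \]

it follows that

\[ \|\hat{Y}^n(t) - \hat{Y}^n(s)\| \le \delta\left( C + K_b e^{K_b} \int_0^1 \|\hat{w}^n(r)\|\,dr \right) + \int_0^1 1_{\{\|\hat{w}^n(r)\| > C\}}\,\|\hat{w}^n(r)\|\,dr. \]

Hence by Markov’s inequality

\[
\begin{aligned}
\limsup_{n \to \infty} P\left( \sup_{|s - t| \le \delta} \|\hat{Y}^n(t) - \hat{Y}^n(s)\| \ge \varepsilon \right)
&\le \frac{\delta}{\varepsilon} \limsup_{n \to \infty} E\left[ C + K_b e^{K_b} \int_0^1 \|\hat{w}^n(r)\|\,dr \right] + \frac{1}{\varepsilon} \limsup_{n \to \infty} E\left[ \int_0^1 1_{\{\|\hat{w}^n(r)\| > C\}}\,\|\hat{w}^n(r)\|\,dr \right] \\
&\le \frac{\delta}{\varepsilon}\left( C + K_b e^{K_b} K_v \right) + \frac{1}{\varepsilon} T(C).
\end{aligned}
\]

Choose $C < \infty$ such that $T(C) \le \varepsilon^2/2$ and then choose $\delta > 0$ so that
$\delta(C + K_b e^{K_b} K_v) \le \varepsilon^2/2$. This shows the tightness of $\{\hat{Y}^n\}$.
The tightness of $\{\int_0^\cdot \hat{w}^n\,ds\}$ is simpler, and follows from the bound

\[ \limsup_{n \to \infty} P\left( \sup_{|s - t| \le \delta} \int_s^t \|\hat{w}^n(r)\|\,dr \ge \varepsilon \right) \le \frac{\delta C}{\varepsilon} + \frac{1}{\varepsilon} T(C). \]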


We still need to show that $\hat{Y}^n$ converges to $\bar{Y}$. This also relies on the
uniform integrability given by Theorem 3.2.

Lemma 3.5 Assume Condition 2.1 and (2.14). Let the sequence $\{\hat{Y}^n(t)\}$
be defined by (3.6), consider a convergent subsequence $\{(\hat{Y}^n, \hat{\eta}^n)\}$ with limit
$(Y^*, \hat{\eta})$, and let $\bar{Y}(t)$ be defined by (2.15). Then w.p.1 $Y^* = \bar{Y}$.

Proof. We can write

\[ \hat{Y}^n(t) = \int_0^t Db(X^0(s))\hat{Y}^n(s)\,ds + \int_0^t \int_{\mathbb{R}^d} y\,\hat{\eta}^n(dy \otimes ds). \]

Using the uniform integrability proved in Theorem 3.2 and that $\hat{\eta}_2$ is Lebesgue
measure w.p.1, sending $n \to \infty$ and using the definition of $\hat{w}$ gives

\[
\begin{aligned}
Y^*(t) &= \int_0^t Db(X^0(s))Y^*(s)\,ds + \int_0^t \int_{\mathbb{R}^d} y\,\hat{\eta}(dy \otimes ds) \\
&= \int_0^t Db(X^0(s))Y^*(s)\,ds + \int_0^t \hat{w}(s)\,ds.
\end{aligned}
\]

By uniqueness of the solution, $Y^* = \bar{Y}$ follows. ■
It remains to show $\bar{Y}^n - \check{Y}^n \to 0$ and $\check{Y}^n - \hat{Y}^n \to 0$. We begin with
$\bar{Y}^n - \check{Y}^n \to 0$. Recall that the difference between $\bar{Y}^n$ and $\check{Y}^n$ is
that the first is driven by the actual noises and the second is driven by their conditional
means. The following theorem is a law of large numbers type result for the
difference between the noises and their conditional means, and is the most
complicated part of the analysis.

Theorem 3.6 Assume Condition 2.1 and (2.14). Consider the sequence
$\{\bar{v}^n_i\}_{i=0,\ldots,n-1}$ of controlled noises and $\{w^n(i/n)\}_{i=0,\ldots,n-1}$ of means
of the controlled noises as in Construction 2.4. For $i \in \{1,\ldots,n\}$ let

\[ W^n_i = \frac{1}{n}\sum_{j=0}^{i-1} a(n)\sqrt{n}\left( \bar{v}^n_j - w^n(j/n) \right). \]

Then for any $\delta > 0$

\[ \lim_{n \to \infty} P\left( \max_{i \in \{1,\ldots,n\}} \|W^n_i\| \ge \delta \right) = 0. \]

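Proof. According to (2.14)

\[ E\left[ \frac{1}{n}\sum_{i=0}^{n-1} R(\eta^n_i \,\|\, \mu_{\bar{X}^n_i}) \right] \le \frac{K_E}{a(n)^2 n}. \]

Because of this the (random) Radon-Nikodym derivatives

\[ f^n_i(y) = \frac{d\eta^n_i}{d\mu_{\bar{X}^n_i}}(y) \]

are well defined and can be selected in a measurable way. We will control
the magnitude of the noise when the Radon-Nikodym derivative is large by
bounding

\[ \frac{1}{n}\sum_{i=0}^{n-1} E\left[ 1_{\{f^n_i(\bar{v}^n_i) > r\}}\,\|\bar{v}^n_i\| \right] \]

for large $r$.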

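From the bound on the moment generating function (2.1),

\[ \sup_{x \in \mathbb{R}^d} \int_{\mathbb{R}^d} e^{\frac{\lambda}{2^d}\|y\|}\,\mu_x(dy) \le 2d\,e^{dK_{mgf}}. \tag{3.7} \]

Let

\[ \sigma = \min\{\lambda/2^{d+1}, 1\} \tag{3.8} \]

and recall the definition $\ell(b) = b\log b - b + 1$. Then

\[ \frac{1}{n}\sum_{i=0}^{n-1} E\left[ 1_{\{f^n_i(\bar{v}^n_i) > r\}}\,\|\bar{v}^n_i\| \right] = \frac{1}{n}\sum_{i=0}^{n-1} E\left[ \int_{\{y : f^n_i(y) > r\}} \|y\|\,f^n_i(y)\,\mu_{\bar{X}^n_i}(dy) \right] \]

and the bound $ab \le e^a + \ell(b)$ for $a, b \ge 0$ with $a = \sigma\|y\|$ and $b = f^n_i(y)$
gives that for all $i$

\[ E\left[ \int_{\{y : f^n_i(y) > r\}} \|y\|\,f^n_i(y)\,\mu_{\bar{X}^n_i}(dy) \right] \le \frac{1}{\sigma} E\left[ \int_{\{y : f^n_i(y) > r\}} e^{\sigma\|y\|}\,\mu_{\bar{X}^n_i}(dy) \right] + \frac{1}{\sigma} E\left[ \int_{\{y : f^n_i(y) > r\}} \ell(f^n_i(y))\,\mu_{\bar{X}^n_i}(dy) \right]. \]

Since $\ell(b) \ge 0$ for all $b \ge 0$

\[ E\left[ \int_{\{y : f^n_i(y) > r\}} \ell(f^n_i(y))\,\mu_{\bar{X}^n_i}(dy) \right] \le E\left[ \int_{\mathbb{R}^d} \ell(f^n_i(y))\,\mu_{\bar{X}^n_i}(dy) \right] = E\left[ R(\eta^n_i \,\|\, \mu_{\bar{X}^n_i}) \right], \]

and by Hölder’s inequality (recall (3.7) and (3.8))

\[
\begin{aligned}
E\left[ \int_{\{y : f^n_i(y) > r\}} e^{\sigma\|y\|}\,\mu_{\bar{X}^n_i}(dy) \right]
&\le E\left[ \left( \int_{\mathbb{R}^d} e^{2\sigma\|y\|}\,\mu_{\bar{X}^n_i}(dy) \right)^{1/2} \left( \int_{\mathbb{R}^d} 1_{\{f^n_i(y) > r\}}\,\mu_{\bar{X}^n_i}(dy) \right)^{1/2} \right] \\
&\le E\left[ \left( 2d\,e^{dK_{mgf}} \right)^{1/2} \mu_{\bar{X}^n_i}\left( \{y : f^n_i(y) > r\} \right)^{1/2} \right].
\end{aligned}
\]

In addition Markov’s inequality gives for $r > e^{-1}$

\[ \mu_{\bar{X}^n_i}\left( \{y : f^n_i(y) > r\} \right) \le \frac{1}{r\log r} \int_{\mathbb{R}^d} \log(f^n_i(y))\,f^n_i(y)\,\mu_{\bar{X}^n_i}(dy) = \frac{R(\eta^n_i \,\|\, \mu_{\bar{X}^n_i})}{r\log r}. \]

Therefore

\[ \frac{1}{n}\sum_{i=0}^{n-1} E\left[ \int_{\{y : f^n_i(y) > r\}} e^{\sigma\|y\|}\,\mu_{\bar{X}^n_i}(dy) \right] \le \left( 2d\,e^{dK_{mgf}} \right)^{1/2} \frac{1}{n}\sum_{i=0}^{n-1} E\left[ \left( \frac{R(\eta^n_i \,\|\, \mu_{\bar{X}^n_i})}{r\log r} \right)^{1/2} \right]. \]

Since by Jensen’s inequality

\[ \frac{1}{n}\sum_{i=0}^{n-1} E\left[ \left( \frac{R(\eta^n_i \,\|\, \mu_{\bar{X}^n_i})}{r\log r} \right)^{1/2} \right] \le \left( \frac{1}{r\log r}\,\frac{1}{n}\sum_{i=0}^{n-1} E\left[ R(\eta^n_i \,\|\, \mu_{\bar{X}^n_i}) \right] \right)^{1/2}, \]

we obtain the overall bound

\[
\begin{aligned}
\frac{1}{n}\sum_{i=0}^{n-1} E\left[ 1_{\{f^n_i(\bar{v}^n_i) > r\}}\,\|\bar{v}^n_i\| \right]
&\le \frac{1}{\sigma}\left( 2d\,e^{dK_{mgf}} \right)^{1/2} \left( \frac{1}{r\log r}\,\frac{1}{n}\sum_{i=0}^{n-1} E\left[ R(\eta^n_i \,\|\, \mu_{\bar{X}^n_i}) \right] \right)^{1/2} + \frac{1}{\sigma}\,\frac{1}{n}\sum_{i=0}^{n-1} E\left[ R(\eta^n_i \,\|\, \mu_{\bar{X}^n_i}) \right] \\
&\le \frac{1}{\sigma}\,\frac{K_E^{1/2}}{a(n)\sqrt{n}} \left( 2d\,e^{dK_{mgf}} \right)^{1/2} \left( \frac{1}{r\log r} \right)^{1/2} + \frac{1}{\sigma}\,\frac{K_E}{a(n)^2 n}.
\end{aligned}
\]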

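Using this result we can complete the proof. Define

\[ \xi^{n,r}_i = \begin{cases} \bar{v}^n_i & \text{if } f^n_i(\bar{v}^n_i) \le r, \\ 0 & \text{otherwise.} \end{cases} \]

For any $\delta > 0$

\[
\begin{aligned}
P\left\{ \max_{k=0,\ldots,n-1} \left\| \frac{1}{n}\sum_{i=0}^{k} a(n)\sqrt{n}\left( \bar{v}^n_i - w^n(i/n) \right) \right\| \ge 3\delta \right\}
&\le P\left\{ \max_{k=0,\ldots,n-1} \left\| \frac{1}{n}\sum_{i=0}^{k} a(n)\sqrt{n}\left( \bar{v}^n_i - \xi^{n,r}_i \right) \right\| \ge \delta \right\} \\
&\quad + P\left\{ \max_{k=0,\ldots,n-1} \left\| \frac{1}{n}\sum_{i=0}^{k} a(n)\sqrt{n}\left( \xi^{n,r}_i - \int_{\{y : f^n_i(y) \le r\}} y\,\eta^n_i(dy) \right) \right\| \ge \delta \right\} \\
&\quad + P\left\{ \max_{k=0,\ldots,n-1} \left\| \frac{1}{n}\sum_{i=0}^{k} a(n)\sqrt{n}\left( w^n(i/n) - \int_{\{y : f^n_i(y) \le r\}} y\,\eta^n_i(dy) \right) \right\| \ge \delta \right\}.
\end{aligned}
\tag{3.9}
\]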
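The first term satisfies

\[
\begin{aligned}
P\left\{ \max_{k=0,\ldots,n-1} \left\| \frac{1}{n}\sum_{i=0}^{k} a(n)\sqrt{n}\left( \bar{v}^n_i - \xi^{n,r}_i \right) \right\| \ge \delta \right\}
&\le \frac{1}{\delta}\,a(n)\sqrt{n}\,\frac{1}{n}\sum_{i=0}^{n-1} E\left[ \|\bar{v}^n_i - \xi^{n,r}_i\| \right] \\
&= \frac{1}{\delta}\,a(n)\sqrt{n}\,\frac{1}{n}\sum_{i=0}^{n-1} E\left[ 1_{\{f^n_i(\bar{v}^n_i) > r\}}\,\|\bar{v}^n_i\| \right].
\end{aligned}
\]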

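The partial sums in the second term form a martingale, so their norm is a
submartingale. The first inequality in the next display follows from Doob’s
submartingale inequality. The second inequality uses a conditioning argument and
that for any integrable random variable $Z$, $E[Z - EZ]^2 \le EZ^2$. We have

\[
\begin{aligned}
P\left\{ \max_{k=0,\ldots,n-1} \left\| \frac{1}{n}\sum_{i=0}^{k} a(n)\sqrt{n}\left( \xi^{n,r}_i - \int_{\{y : f^n_i(y) \le r\}} y\,\eta^n_i(dy) \right) \right\| \ge \delta \right\}
&\le \frac{1}{\delta^2} E\left[ \left\| \frac{1}{n}\sum_{i=0}^{n-1} a(n)\sqrt{n}\left( \xi^{n,r}_i - \int_{\{y : f^n_i(y) \le r\}} y\,\eta^n_i(dy) \right) \right\|^2 \right] \\
&= \frac{1}{\delta^2}\,\frac{a(n)^2}{n}\sum_{i=0}^{n-1} E\left[ \left\| \xi^{n,r}_i - \int_{\{y : f^n_i(y) \le r\}} y\,\eta^n_i(dy) \right\|^2 \right] \\
&\le \frac{1}{\delta^2}\,\frac{a(n)^2}{n}\sum_{i=0}^{n-1} E\left[ \|\xi^{n,r}_i\|^2 \right] \\
&= \frac{1}{\delta^2}\,\frac{a(n)^2}{n}\sum_{i=0}^{n-1} E\left[ \int_{\{y : f^n_i(y) \le r\}} \|y\|^2 f^n_i(y)\,\mu_{\bar{X}^n_i}(dy) \right] \\
&\le \frac{r}{\delta^2}\,\frac{a(n)^2}{n}\sum_{i=0}^{n-1} E\left[ \int_{\mathbb{R}^d} \|y\|^2\,\mu_{\bar{X}^n_i}(dy) \right] \\
&\le \frac{r}{\delta^2}\,a(n)^2 K_{\mu,2},
\end{aligned}
\]

where

\[ K_{\mu,2} = \sup_{x \in \mathbb{R}^d} \int_{\mathbb{R}^d} \|y\|^2\,\mu_x(dy) < \infty, \]

and the finiteness is due to (2.1). We can use Jensen’s inequality with the
third term and get the same bound that was shown for the first. We have

\[
\begin{aligned}
P\left\{ \max_{k=0,\ldots,n-1} \left\| \frac{1}{n}\sum_{i=0}^{k} a(n)\sqrt{n}\left( w^n(i/n) - \int_{\{y : f^n_i(y) \le r\}} y\,\eta^n_i(dy) \right) \right\| \ge \delta \right\}
&\le \frac{1}{\delta}\,a(n)\sqrt{n}\,\frac{1}{n}\sum_{i=0}^{n-1} E\left[ \left\| w^n(i/n) - \int_{\{y : f^n_i(y) \le r\}} y\,\eta^n_i(dy) \right\| \right] \\
&= \frac{1}{\delta}\,a(n)\sqrt{n}\,\frac{1}{n}\sum_{i=0}^{n-1} E\left[ \left\| \int_{\{y : f^n_i(y) > r\}} y\,\eta^n_i(dy) \right\| \right] \\
&\le \frac{1}{\delta}\,a(n)\sqrt{n}\,\frac{1}{n}\sum_{i=0}^{n-1} E\left[ \int_{\{y : f^n_i(y) > r\}} \|y\|\,\eta^n_i(dy) \right] \\
&= \frac{1}{\delta}\,a(n)\sqrt{n}\,\frac{1}{n}\sum_{i=0}^{n-1} E\left[ 1_{\{f^n_i(\bar{v}^n_i) > r\}}\,\|\bar{v}^n_i\| \right].
\end{aligned}
\]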

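Combining the bounds for these three terms with (3.9) gives

\[
\begin{aligned}
&P\left\{\max_{k=0,\ldots,n-1}\left\|\frac{1}{n}\sum_{i=0}^{k} a(n)\sqrt{n}\left(\bar v_i^n - \int_{\mathbb{R}^d} y\, \eta_i^n(dy)\right)\right\| \ge 3\delta\right\} \\
&\quad \le \frac{2}{\delta} a(n)\sqrt{n}\, \frac{1}{n}\sum_{i=0}^{n-1} E\left[1_{\{f_i^n(\bar v_i^n) > r\}} \|\bar v_i^n\|\right] + \frac{r}{\delta^2} a(n)^2 K_{\mu,2} \\
&\quad \le \frac{2}{\sigma\delta} K_E^{1/2}\left(2d e^{dK_{mgf}}\right)^{1/2}\left(\frac{1}{r \log r}\right)^{1/2}
+ \frac{2}{\sigma\delta}\frac{K_E}{a(n)\sqrt{n}}
+ a(n)^2 \frac{r}{\delta^2} K_{\mu,2}.
\end{aligned}
\]

Choosing $r = 1/a(n)$ and using $a(n) \to 0$, $a(n)\sqrt{n} \to \infty$ gives

\[
P\left\{\max_{k=0,\ldots,n-1}\left\|\frac{1}{n}\sum_{i=0}^{k} a(n)\sqrt{n}\left(\bar v_i^n - \int_{\mathbb{R}^d} y\, \eta_i^n(dy)\right)\right\| \ge 3\delta\right\} \to 0
\]

as $n \to \infty$, which completes the proof. ■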
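The last step can be checked numerically. The following is a minimal sketch, assuming the sample scaling $a(n) = n^{-1/4}$ and placeholder values (all equal to 1) for $\sigma$, $\delta$, $K_E$, $K_{mgf}$, $K_{\mu,2}$ and $d$; none of these values, nor the function name `bound_terms`, come from the paper. It evaluates the three terms of the final bound with $r = 1/a(n)$ and shows each of them shrinking as $n$ grows.

```python
import math

def bound_terms(n, delta=1.0, sigma=1.0, K_E=1.0, K_mgf=1.0, K_mu2=1.0, d=1):
    """Three terms of the final bound, with a(n) = n**(-1/4) and r = 1/a(n).

    Every constant here is a placeholder chosen for illustration only.
    """
    a = n ** -0.25                      # a(n) -> 0 while a(n)*sqrt(n) = n**(1/4) -> infinity
    r = 1.0 / a                         # truncation level chosen at the end of the proof
    c = math.sqrt(2 * d * math.exp(d * K_mgf))
    t1 = (2 / (sigma * delta)) * math.sqrt(K_E) * c * math.sqrt(1 / (r * math.log(r)))
    t2 = (2 / (sigma * delta)) * K_E / (a * math.sqrt(n))
    t3 = a ** 2 * (r / delta ** 2) * K_mu2
    return t1, t2, t3

for n in (10**4, 10**8, 10**12):
    print(n, bound_terms(n))
```

With this choice the third term equals $a(n) K_{\mu,2}/\delta^2$, which is why $r$ cannot grow faster than $1/a(n)^2$, while the first term only needs $r \log r \to \infty$.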

This theorem, combined with the following discrete version of Gronwall's inequality, will allow us to prove $\bar Y^n - \check Y^n \to 0$.


Lemma 3.7 If $\{a_n\}$, $\{b_n\}$, and $\{c_n\}$ are nonnegative sequences defined for $n = 0, 1, \ldots$ and satisfying

\[
a_n \le c_n + \sum_{k=0}^{n-1} b_k a_k,
\]

then

\[
a_n \le c_n + \sum_{k=0}^{n-1} b_k c_k \exp\left\{\sum_{i=k+1}^{n-1} b_i\right\}.
\]
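The lemma can be sanity-checked numerically. The sketch below is illustrative only: the helper names and the random test sequences are assumptions made here, not part of the paper. It builds the largest sequence permitted by the hypothesis (equality in the recursion) and compares it with the right-hand side of the bound.

```python
import math
import random

def gronwall_bound(b, c, n):
    """Right-hand side of the discrete Gronwall bound at index n."""
    total = c[n]
    for k in range(n):
        tail = sum(b[i] for i in range(k + 1, n))   # sum_{i=k+1}^{n-1} b_i
        total += b[k] * c[k] * math.exp(tail)
    return total

def extremal_sequence(b, c):
    """Largest sequence allowed by the hypothesis: a_n = c_n + sum_{k<n} b_k a_k."""
    a = []
    for n in range(len(c)):
        a.append(c[n] + sum(b[k] * a[k] for k in range(n)))
    return a

random.seed(0)
N = 30
b = [random.uniform(0.0, 0.2) for _ in range(N)]
c = [random.uniform(0.0, 1.0) for _ in range(N)]
a = extremal_sequence(b, c)
worst_gap = min(gronwall_bound(b, c, n) - a[n] for n in range(N))
print("smallest slack between bound and sequence:", worst_gap)
```

The slack is zero for $n = 0, 1$ and positive afterwards, reflecting the step $1 + b_j \le e^{b_j}$ behind the exponential form of the bound.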


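Theorem 3.8 Under the conditions of Theorem 3.6, $\bar Y^n - \check Y^n \to 0$ in probability.


Proof. Recall that

\[
\bar Y_k^n = \sum_{i=0}^{k-1}\frac{a(n)\sqrt{n}}{n}\left(b\left(X_i^{n,0} + \frac{1}{a(n)\sqrt{n}}\bar Y_i^n\right) - b\left(X_i^{n,0}\right)\right) + \sum_{i=0}^{k-1}\frac{a(n)\sqrt{n}}{n}\,\bar v_i^n
\]

and

\[
\check Y_k^n = \sum_{i=0}^{k-1}\frac{a(n)\sqrt{n}}{n}\left(b\left(X_i^{n,0} + \frac{1}{a(n)\sqrt{n}}\check Y_i^n\right) - b\left(X_i^{n,0}\right)\right) + \sum_{i=0}^{k-1}\frac{a(n)\sqrt{n}}{n}\int_{\mathbb{R}^d} y\, \eta_i^n(dy),
\]

so with $W_k^n$ defined as in Theorem 3.6

\[
\left\|\bar Y_k^n - \check Y_k^n\right\| \le \left\|W_k^n\right\| + \sum_{j=0}^{k-1}\frac{K_b}{n}\left\|\bar Y_j^n - \check Y_j^n\right\|.
\]

Using Lemma 3.7 gives

\[
\begin{aligned}
\left\|\bar Y_k^n - \check Y_k^n\right\|
&\le \left\|W_k^n\right\| + \sum_{i=0}^{k-1}\left\|W_i^n\right\|\frac{K_b}{n}\exp\left\{\sum_{j=i+1}^{k-1}\frac{K_b}{n}\right\} \\
&\le \left(1 + K_b e^{K_b}\right)\max_{i \in \{1,\ldots,k\}}\left\{\left\|W_i^n\right\|\right\}
\end{aligned}
\]

so

\[
\max_{i \in \{1,\ldots,n\}}\left\{\left\|\bar Y_i^n - \check Y_i^n\right\|\right\} \le \left(1 + K_b e^{K_b}\right)\max_{i \in \{1,\ldots,n\}}\left\{\left\|W_i^n\right\|\right\}.
\]

Since $\max_{i \in \{1,\ldots,n\}}\{\|W_i^n\|\} \to 0$ in probability,

\[
\max_{i \in \{1,\ldots,n\}}\left\{\left\|\bar Y_i^n - \check Y_i^n\right\|\right\} \to 0
\quad\text{and hence}\quad
\sup_{t \in [0,1]}\left\|\bar Y^n(t) - \check Y^n(t)\right\| \to 0
\]

in probability. ■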

To complete the proof of the convergence we need to show $\check Y^n - \hat Y^n \to 0$. Recall that these two processes have the same driving terms but different drifts, in that $\hat Y^n$ satisfies the ODE

\[
\hat Y^n(t) = \int_0^t Db(X^0(s))\, \hat Y^n(s)\, ds + \int_0^t w^n(s)\, ds
\]

while $\check Y^n$ is the linear interpolation of the discrete time process defined by $\check Y_0^n = 0$ and

\[
\check Y_{i+1}^n = \check Y_i^n + \frac{a(n)\sqrt{n}}{n}\left(b\left(X_i^{n,0} + \frac{1}{a(n)\sqrt{n}}\check Y_i^n\right) - b\left(X_i^{n,0}\right)\right) + \frac{1}{n} w_i^n.
\]

However, essentially the same arguments as those used in Lemma 3.4 to show tightness of $\{\hat Y^n\}$ can be used to prove tightness of $\{\check Y^n\}$, and then it easily follows as in Lemma 3.5 that any limit will satisfy the same ODE (2.15) as the limit of $\{\hat Y^n\}$, and therefore $\check Y^n - \hat Y^n \to 0$ follows.

Combining $\bar Y^n - \check Y^n \to 0$, $\check Y^n - \hat Y^n \to 0$, and $\hat Y^n \to \bar Y$ demonstrates that along the subsequence where $\hat\eta^n \to \hat\eta$ weakly, $\bar Y^n \to \bar Y$ in distribution, which implies that along this subsequence $(\hat\eta^n, \bar Y^n) \to (\hat\eta, \bar Y)$ weakly. We have already shown that with probability 1 $\hat\eta_2(dt)$ is Lebesgue measure and

\[
\bar Y(t) = \int_0^t Db(X^0(s))\, \bar Y(s)\, ds + \int_0^t\int_{\mathbb{R}^d} y\, \hat\eta_{1|2}(dy|s)\, ds,
\]

so the proof of convergence (i.e., the first part of Theorem 2.5) is complete.

To finish Theorem 2.5 we must lastly show the bound (2.16). Note that the weak convergence of $\bar Y^n$ implies

\[
\sup_{t \in [0,1]}\left\|\bar X^n(\lfloor nt \rfloor / n) - X^0(t)\right\| \to 0 \quad\text{in probability.}
\tag{3.10}
\]

Define random measures on $\mathbb{R}^d \times \mathbb{R}^d \times [0,1]$ by

\[
\gamma^n(dx \otimes dy \otimes dt) = \delta_{\bar X^n(\lfloor nt \rfloor / n)}(dx)\, \hat\eta^n(dy \otimes dt).
\]

Note that the tightness of $\{\gamma^n\}$ follows easily from (3.10) and from the tightness of $\{\hat\eta^n\}$. Thus given any subsequence we can choose a further subsequence (again we will retain $n$ as the index for simplicity) along which $\{\gamma^n\}$ converges weakly to some limit $\gamma$ on $\mathcal{P}(\mathbb{R}^d \times \mathbb{R}^d \times [0,1])$ with

\[
\gamma_{2,3}(dy \otimes dt) = \hat\eta(dy \otimes dt),
\]

where $\gamma_{2,3}$ is the second and third marginal of $\gamma$. If we establish (2.16) for this subsequence it follows for the original one using a standard argument by contradiction. For $\sigma > 0$ let

\[
G_\sigma = \left\{(x, y, t) : \left\|x - X^0(t)\right\| \le \sigma\right\}
\]

be closed sets centered around $X^0(t)$ in the $x$ variable, and note that by (3.10) and weak convergence, for all $\sigma > 0$

\[
1 = \limsup_{n \to \infty} E\left[\gamma^n(G_\sigma)\right] \le E\left[\gamma(G_\sigma)\right].
\]

Thus

\[
E\left[\gamma\left(\cap_{n \in \mathbb{N}} G_{1/n}\right)\right] = 1
\]

so, with probability 1, $\gamma$ puts all its mass on $\{(x, y, t) : x = X^0(t)\}$. Therefore with probability 1, for a.e. $(y, t)$ under $\gamma_{2,3}(dy \otimes dt)$,

\[
\gamma_{1|2,3}(dx|y,t) = \delta_{X^0(t)}(dx).
\]

Combined with the fact that the second marginal of $\hat\eta(dy \otimes dt)$ is Lebesgue measure, this gives

\[
\gamma(dx \otimes dy \otimes dt) = \delta_{X^0(t)}(dx)\, \hat\eta_{1|2}(dy|t)\, dt.
\tag{3.11}
\]


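Let

\[
L_K(x, \beta) = \sup_{\alpha \in \mathbb{R}^d}\left[\langle \alpha, \beta\rangle - \frac{1}{2}\|\alpha\|^2_{A(x)} - \frac{1}{2K}\|\alpha\|^2\right].
\]

Then (2.4) implies that

\[
\liminf_{n \to \infty} a(n)^2 n\, L_c\left(x, \frac{1}{a(n)\sqrt{n}}\beta\right) \ge L_K(x, \beta)
\tag{3.12}
\]

uniformly in $x$ and compact subsets of $\beta$. We also have

\[
L_K(x, \beta) \uparrow \frac{1}{2}\|\beta\|^2_{A^{-1}(x)}
\]

as $K \to \infty$ for all $(x, \beta) \in \mathbb{R}^{2d}$. Combining (3.12) with Lemma 3.1 and using Fatou's lemma for weak convergence,

\[
\begin{aligned}
\liminf_{n \to \infty} a(n)^2 n\, E\left[\frac{1}{n}\sum_{i=0}^{n-1} R\left(\eta_i^n \,\|\, \mu_{\bar X_i^n}\right)\right]
&\ge \liminf_{n \to \infty} E\left[\int_{\mathbb{R}^d \times \mathbb{R}^d \times [0,1]} a(n)^2 n\, L_c\left(x, \frac{1}{a(n)\sqrt{n}} y\right) \gamma^n(dx \otimes dy \otimes dt)\right] \\
&\ge E\left[\int_{\mathbb{R}^d \times \mathbb{R}^d \times [0,1]} L_K(x, y)\, \gamma(dx \otimes dy \otimes dt)\right]
\end{aligned}
\]

for all $K$. Then using the monotone convergence theorem, the decomposition (3.11), and Jensen's inequality in that order shows that

\[
\begin{aligned}
\liminf_{n \to \infty} a(n)^2 n\, E\left[\frac{1}{n}\sum_{i=0}^{n-1} R\left(\eta_i^n \,\|\, \mu_{\bar X_i^n}\right)\right]
&\ge \lim_{K \to \infty} E\left[\int_{\mathbb{R}^d \times \mathbb{R}^d \times [0,1]} L_K(x, y)\, \gamma(dx \otimes dy \otimes dt)\right] \\
&= E\left[\int_{\mathbb{R}^d \times \mathbb{R}^d \times [0,1]} \frac{1}{2}\|y\|^2_{A^{-1}(x)}\, \gamma(dx \otimes dy \otimes dt)\right] \\
&= E\left[\int_0^1\int_{\mathbb{R}^d} \frac{1}{2}\|y\|^2_{A^{-1}(X^0(t))}\, \hat\eta_{1|2}(dy|t)\, dt\right] \\
&\ge E\left[\frac{1}{2}\int_0^1 \left\|\int_{\mathbb{R}^d} y\, \hat\eta_{1|2}(dy|t)\right\|^2_{A^{-1}(X^0(t))} dt\right],
\end{aligned}
\]

which is (2.16).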
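The role of the regularized quantity $L_K$ in the argument above can be seen in the scalar case. The following is a minimal sketch, assuming $d = 1$ and a fixed variance value $A \ge 0$ (both simplifications made here for illustration; the function names are hypothetical). Maximizing the concave quadratic in $\alpha$ gives $L_K(\beta) = \beta^2 / (2(A + 1/K))$, which increases to $\beta^2/(2A)$ as $K \to \infty$; a crude grid search is included only as a cross-check of the closed form.

```python
def L_K(beta, A, K):
    """Closed form of sup over alpha of alpha*beta - A*alpha**2/2 - alpha**2/(2K), scalar A >= 0."""
    return beta ** 2 / (2.0 * (A + 1.0 / K))

def L_K_numeric(beta, A, K, steps=20001, span=5.0):
    """Grid search over alpha in [-span, span]; a cross-check, not an efficient method."""
    best = float("-inf")
    for j in range(steps):
        alpha = -span + 2.0 * span * j / (steps - 1)
        value = alpha * beta - 0.5 * A * alpha ** 2 - alpha ** 2 / (2.0 * K)
        best = max(best, value)
    return best

beta, A = 1.5, 2.0
values = [L_K(beta, A, K) for K in (1.0, 10.0, 100.0, 1000.0)]
print(values, "limit:", beta ** 2 / (2.0 * A))
```

When $A = 0$ and $\beta \ne 0$ the same formula gives $L_K = K\beta^2/2 \to \infty$, which matches the convention that the $A^{-1}$ norm is infinite in degenerate directions.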


4  Laplace  Upper  Bound 


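The goal of this section is to prove (2.12), which due to the minus sign corresponds to the Laplace upper bound. Suppose for each $n$ that $\eta^n$ comes within $\varepsilon$ of achieving the infimum in (2.9), so that

\[
\liminf_{n \to \infty} -a(n)^2 \log E\left[e^{-\frac{1}{a(n)^2} F(Y^n)}\right] + \varepsilon
\ge \liminf_{n \to \infty} E\left[\sum_{i=0}^{n-1} a(n)^2 R\left(\eta_i^n \,\|\, \mu_{\bar X_i^n}\right) + F(\bar Y^n)\right].
\tag{4.1}
\]

Since $\sup_{x \in \mathbb{R}^d} |F(x)| \le K_F$ for some $K_F < \infty$, we also have

\[
\sup_n\, a(n)^2 n\, E\left[\frac{1}{n}\sum_{i=0}^{n-1} R\left(\eta_i^n \,\|\, \mu_{\bar X_i^n}\right)\right] \le 2K_F + \varepsilon.
\]

Consequently we can choose a subsequence of $\{\eta^n\}$ (we retain $n$ as the index for convenience) along which the conclusions of Theorem 2.5 hold. Combining this with (4.1) gives

\[
\begin{aligned}
\liminf_{n \to \infty} -a(n)^2 \log E\left[e^{-\frac{1}{a(n)^2} F(Y^n)}\right] + \varepsilon
&\ge \liminf_{n \to \infty} E\left[\sum_{i=0}^{n-1} a(n)^2 R\left(\eta_i^n \,\|\, \mu_{\bar X_i^n}\right) + F(\bar Y^n)\right] \\
&\ge E\left[\frac{1}{2}\int_0^1 \|w(s)\|^2_{A^{-1}(X^0(s))}\, ds + F(\bar Y)\right].
\end{aligned}
\]

Recalling

\[
\bar Y(t) = \int_0^t Db(X^0(s))\, \bar Y(s)\, ds + \int_0^t w(s)\, ds,
\]

it follows that

\[
\begin{aligned}
E\left[\frac{1}{2}\int_0^1 \|w(s)\|^2_{A^{-1}(X^0(s))}\, ds + F(\bar Y)\right]
&\ge \inf_{v \in L^2([0,1] : \mathbb{R}^d)}\left\{\frac{1}{2}\int_0^1 \|v(s)\|^2_{A^{-1}(X^0(s))}\, ds + F(\phi^v)\right\} \\
&= \inf_{u \in L^2([0,1] : \mathbb{R}^d)}\left\{\frac{1}{2}\int_0^1 \|u(s)\|^2\, ds + F\left(\phi^{A^{1/2}(X^0)u}\right)\right\}
\end{aligned}
\]

with $\phi^u$ defined as in (2.8). Since $\varepsilon > 0$ is arbitrary, we have the lower bound (2.12).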


5  Laplace  Lower  Bound 

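The goal of this section is to prove (2.13). Note that for $u, v \in L^2([0,1] : \mathbb{R}^d)$ and $\phi^{A^{1/2}(X^0)u}$, $\phi^{A^{1/2}(X^0)v}$ given by (2.8)

\[
\begin{aligned}
\phi^{A^{1/2}(X^0)u}(t) - \phi^{A^{1/2}(X^0)v}(t)
&= \int_0^t Db(X^0(s))\left(\phi^{A^{1/2}(X^0)u}(s) - \phi^{A^{1/2}(X^0)v}(s)\right) ds \\
&\quad + \int_0^t A^{1/2}(X^0(s))\left(u(s) - v(s)\right) ds.
\end{aligned}
\]

Thus by Gronwall's inequality

\[
\begin{aligned}
\sup_{t \in [0,1]}\left\|\phi^{A^{1/2}(X^0)u}(t) - \phi^{A^{1/2}(X^0)v}(t)\right\|
&\le \left(1 + K_b e^{K_b}\right)\int_0^1 \left\|A^{1/2}(X^0(s))u(s) - A^{1/2}(X^0(s))v(s)\right\| ds \\
&\le \left(1 + K_b e^{K_b}\right) K_A^{1/2}\left(\int_0^1 \|u(s) - v(s)\|^2\, ds\right)^{1/2}.
\end{aligned}
\tag{5.1}
\]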


Since $C([0,1] : \mathbb{R}^d)$ is dense in $L^2([0,1] : \mathbb{R}^d)$, the proof of the Laplace lower bound is reduced to showing that for an arbitrary $u \in C([0,1] : \mathbb{R}^d)$

\[
\limsup_{n \to \infty} -a(n)^2 \log E\left[e^{-\frac{1}{a(n)^2} F(Y^n)}\right]
\le \frac{1}{2}\int_0^1 \|u(s)\|^2\, ds + F\left(\phi^{A^{1/2}(X^0)u}\right).
\tag{5.2}
\]

The main difficulty is to deal with the possible degeneracy of the noise. Recall the orthogonal decomposition of $A^{-1}(x)$ (2.2). Define

\[
A_K^{-1}(x) = Q(x)\Lambda_K^{-1}(x)Q^T(x)
\]

where $\Lambda_K^{-1}(x)$ is the diagonal matrix such that $\Lambda_{ii,K}^{-1}(x) = \Lambda_{ii}^{-1}(x)$ when $\Lambda_{ii}^{-1}(x) \le K^2$ and $\Lambda_{ii,K}^{-1}(x) = K^2$ when $\Lambda_{ii}^{-1}(x) > K^2$. Note that by [20, Theorem 6.2.37] $A^{1/2}(x)$, $A_K^{-1}(x)$ and $A_K^{-1/2}(x)$ are continuous functions of $A(x)$, and consequently they are also continuous functions of $x \in \mathbb{R}^d$. In addition define

\[
u_K(s) =
\begin{cases}
u(s) & \text{for } \|u(s)\| \le K, \\
\dfrac{K u(s)}{\|u(s)\|} & \text{for } \|u(s)\| > K.
\end{cases}
\]
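The two truncations just defined can be sketched in coordinates. The block below is illustrative only: it assumes the vectors are already expressed in the eigenbasis of $A(x)$ (so $Q$ is the identity), and the function names are hypothetical. It builds the diagonal of $\Lambda_K^{-1/2}$, the truncated control $u_K$, and the product $A^{1/2}A_K^{-1/2}u_K$, whose norm never exceeds $\|u\|$ because every eigenvalue of $A^{1/2}A_K^{-1/2}$ is at most 1.

```python
import math

def inv_sqrt_truncated(eigs, K):
    """Diagonal of Lambda_K^{-1/2}: sqrt(min(1/lam, K^2)), reading 1/0 as +infinity."""
    return [K if lam == 0.0 else math.sqrt(min(1.0 / lam, K * K)) for lam in eigs]

def truncate_control(u, K):
    """u_K: leave u unchanged if ||u|| <= K, otherwise rescale it to norm K."""
    norm = math.sqrt(sum(x * x for x in u))
    return list(u) if norm <= K else [K * x / norm for x in u]

def drive_term(eigs, u, K):
    """Coordinates of A^{1/2} A_K^{-1/2} u_K in the eigenbasis of A (Q = identity assumed)."""
    u_K = truncate_control(u, K)
    diag = inv_sqrt_truncated(eigs, K)
    return [math.sqrt(lam) * d_i * x for lam, d_i, x in zip(eigs, diag, u_K)]

eigs = [4.0, 0.01, 0.0]          # hypothetical eigenvalues, one of them degenerate
u = [3.0, -2.0, 1.0]
for K in (5.0, 100.0):
    print(K, drive_term(eigs, u, K))
```

For large $K$ the output approaches the projection of $u$ onto the range of $A$: the component along the zero eigenvalue is removed, which is how the construction copes with degenerate noise.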


Let $\phi^{u,K}(t) = \phi^{A(X^0)A_K^{-1/2}(X^0)u_K}(t)$ and note that $\phi^{u,K}$ solves

\[
\phi^{u,K}(t) = \int_0^t Db(X^0(s))\, \phi^{u,K}(s)\, ds + \int_0^t A(X^0(s))A_K^{-1/2}(X^0(s))u_K(s)\, ds.
\tag{5.3}
\]

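To simplify notation we define $s_i^n = i/n$ and $s^n(t) = \lfloor nt \rfloor / n$, where $\lfloor a \rfloor$ is the integer part of $a$. Note that $s^n(t) - t \to 0$ uniformly for $t \in [0,1]$ as $n \to \infty$. For $n$ sufficiently large

\[
\max_{0 \le i \le n-1}\left\|\frac{1}{a(n)\sqrt{n}} A_K^{-1/2}\left(X^0(s_i^n)\right)u_K(s_i^n)\right\| \le \frac{1}{a(n)\sqrt{n}} K^2 \le \lambda_{DA}
\]

and we can define the sequence $\{(\bar X^{n,u,K}, \bar Y^{n,u,K}, \eta^{n,u,K}, \hat\eta^{n,u,K})\}$ as in Construction 2.4 with

\[
\eta_i^{n,u,K}(dy) = \exp\left\{\left\langle y, \frac{1}{a(n)\sqrt{n}} A_K^{-1/2}\left(X^0(s_i^n)\right)u_K(s_i^n)\right\rangle - H_c\left(\bar X_i^{n,u,K}, \frac{1}{a(n)\sqrt{n}} A_K^{-1/2}\left(X^0(s_i^n)\right)u_K(s_i^n)\right)\right\}\mu_{\bar X_i^{n,u,K}}(dy).
\]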
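The effect of this exponential change of measure can be illustrated on a finite state space. The sketch below is an assumption-laden toy example, not the paper's setting: it takes a hypothetical scalar, zero-mean, three-point noise law $\mu$ and tilts it by a small $\alpha$. For small $\alpha$ the mean of the tilted law is close to $A\alpha$ (with $A$ the variance of $\mu$) and the relative entropy is close to $\frac{1}{2}A\alpha^2$, which is the mechanism behind the quadratic cost in the rate function.

```python
import math

def tilt(points, probs, alpha):
    """Exponentially tilted law eta(y) = exp(alpha*y - H(alpha)) * mu(y) on a finite set."""
    H = math.log(sum(p * math.exp(alpha * y) for y, p in zip(points, probs)))
    eta = [p * math.exp(alpha * y - H) for y, p in zip(points, probs)]
    return eta, H

def relative_entropy(eta, probs):
    """R(eta || mu) = sum of eta * log(eta / mu) for two laws on the same finite set."""
    return sum(e * math.log(e / p) for e, p in zip(eta, probs) if e > 0.0)

points = [-1.0, 0.0, 2.0]                     # hypothetical noise values with mean zero
probs = [0.5, 0.25, 0.25]
A = sum(p * y * y for y, p in zip(points, probs))   # variance of mu, here 1.5
alpha = 0.01
eta, H = tilt(points, probs, alpha)
mean = sum(e * y for y, e in zip(points, eta))
R = relative_entropy(eta, probs)
print("tilted mean:", mean, "A*alpha:", A * alpha)
print("relative entropy:", R, "A*alpha^2/2:", 0.5 * A * alpha ** 2)
```

In this example the relative entropy also equals $\alpha \cdot \text{mean} - H(\alpha)$ exactly, since $\log(d\eta/d\mu)(y) = \alpha y - H(\alpha)$.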

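Using (2.3) and the fact that

\[
\int_{\mathbb{R}^d} y \exp\left\{\langle y, \alpha\rangle - H_c(x, \alpha)\right\}\mu_x(dy) = D_\alpha H_c(x, \alpha),
\]

we have for $\|\alpha\| \le \lambda_{DA}$

\[
\left\|\int_{\mathbb{R}^d} y \exp\left\{\langle y, \alpha\rangle - H_c(x, \alpha)\right\}\mu_x(dy) - A(x)\alpha\right\| \le K_{DA}\|\alpha\|^2.
\tag{5.4}
\]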


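The next result identifies the limit in probability of the controlled processes and an asymptotic bound for the relative entropies.

Theorem 5.1 Let $u \in C([0,1] : \mathbb{R}^d)$ and $K < \infty$ be given, construct $\{(\bar X^{n,u,K}, \bar Y^{n,u,K}, \eta^{n,u,K}, \hat\eta^{n,u,K})\}$ as in this section and define $\phi^{u,K}$ by (5.3). Then

\[
\bar Y^{n,u,K} \to \phi^{u,K}
\tag{5.5}
\]

in $C([0,1] : \mathbb{R}^d)$ in probability, and

\[
\limsup_{n \to \infty} a^2(n) n\, E\left[\frac{1}{n}\sum_{i=0}^{n-1} R\left(\eta_i^{n,u,K} \,\|\, \mu_{\bar X_i^{n,u,K}}\right)\right]
\le \frac{1}{2}\int_0^1 \left\|A_K^{-1/2}(X^0(s))u_K(s)\right\|^2_{A(X^0(s))} ds.
\tag{5.6}
\]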


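Proof. Write $\alpha_i^n = \frac{1}{a(n)\sqrt{n}} A_K^{-1/2}\left(X^0(s_i^n)\right)u_K(s_i^n)$ to shorten the display. Using (5.4) to bound the second term and (2.4) to bound the third, for $n$ satisfying $\frac{1}{a(n)\sqrt{n}} K^2 \le \lambda_{DA}$

\[
\begin{aligned}
R\left(\eta_i^{n,u,K} \,\|\, \mu_{\bar X_i^{n,u,K}}\right)
&= \int_{\mathbb{R}^d}\left\langle y, \alpha_i^n\right\rangle \eta_i^{n,u,K}(dy) - H_c\left(\bar X_i^{n,u,K}, \alpha_i^n\right) \\
&= \left\langle A\left(\bar X_i^{n,u,K}\right)\alpha_i^n, \alpha_i^n\right\rangle
+ \left\langle \int_{\mathbb{R}^d} y\, \eta_i^{n,u,K}(dy) - A\left(\bar X_i^{n,u,K}\right)\alpha_i^n, \alpha_i^n\right\rangle
- H_c\left(\bar X_i^{n,u,K}, \alpha_i^n\right) \\
&\le \frac{1}{2 a(n)^2 n}\left\|A_K^{-1/2}\left(X^0(s_i^n)\right)u_K(s_i^n)\right\|^2_{A(\bar X_i^{n,u,K})} + \frac{2}{a(n)^3 n^{3/2}} K_{DA} K^6.
\end{aligned}
\]

Consequently

\[
\limsup_{n \to \infty} a^2(n) n\, E\left[\frac{1}{n}\sum_{i=0}^{n-1} R\left(\eta_i^{n,u,K} \,\|\, \mu_{\bar X_i^{n,u,K}}\right)\right]
\le \limsup_{n \to \infty} \frac{1}{2} E\left[\frac{1}{n}\sum_{i=0}^{n-1}\left\|A_K^{-1/2}\left(X^0(s_i^n)\right)u_K(s_i^n)\right\|^2_{A(\bar X_i^{n,u,K})}\right],
\tag{5.7}
\]

where in fact

\[
\limsup_{n \to \infty} \frac{1}{2} E\left[\frac{1}{n}\sum_{i=0}^{n-1}\left\|A_K^{-1/2}\left(X^0(s_i^n)\right)u_K(s_i^n)\right\|^2_{A(\bar X_i^{n,u,K})}\right] \le \frac{1}{2} K^4 K_A.
\]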


Therefore (2.14) is satisfied by $\{\eta^{n,u,K}\}$, so we can apply Theorem 2.5 and choose a subsequence (keeping $n$ as the index for convenience) along which $\{(\hat\eta^{n,u,K}, \bar Y^{n,u,K})\}$ converges weakly to some limit $(\hat\eta^{u,K}, \bar Y^{u,K})$, where $\hat\eta_2^{u,K}$ is Lebesgue measure and

\[
\bar Y^{u,K}(t) = \int_0^t Db(X^0(s))\, \bar Y^{u,K}(s)\, ds + \int_0^t\int_{\mathbb{R}^d} y\, \hat\eta_{1|2}^{u,K}(dy|s)\, ds.
\]

This implies

\[
\sup_{t \in [0,1]}\left\|\bar X^{n,u,K}(t) - X^0(t)\right\| \to 0
\tag{5.8}
\]

in probability. Because of this, the uniform bound on $A^{1/2}(x)$ and the continuity of $A^{1/2}(x)$, we have (recall that $s^n(t) = \lfloor nt \rfloor / n$)

\[
\sup_{t \in [0,1]}\left\|A^{1/2}\left(\bar X^{n,u,K}(s^n(t))\right) - A^{1/2}\left(X^0(s^n(t))\right)\right\| \to 0
\]

in probability. Moreover, the continuity of $A^{1/2}(X^0)A_K^{-1/2}(X^0)u_K$ gives

\[
\sup_{t \in [0,1]}\left\|A^{1/2}\left(X^0(s^n(t))\right)A_K^{-1/2}\left(X^0(s^n(t))\right)u_K(s^n(t)) - A^{1/2}\left(X^0(t)\right)A_K^{-1/2}\left(X^0(t)\right)u_K(t)\right\| \to 0.
\]

Combining these limits, and using the fact that $A_K^{-1/2}(X^0)u_K$ is uniformly bounded, shows that

\[
\sup_{t \in [0,1]}\left\|A^{1/2}\left(\bar X^{n,u,K}(s^n(t))\right)A_K^{-1/2}\left(X^0(s^n(t))\right)u_K(s^n(t)) - A^{1/2}\left(X^0(t)\right)A_K^{-1/2}\left(X^0(t)\right)u_K(t)\right\| \to 0
\tag{5.9}
\]

in probability. This combined with the uniform bound on $A_K^{-1/2}(X^0)u_K$ and dominated convergence gives

\[
\limsup_{n \to \infty} E\left[\frac{1}{2}\int_0^1 \left\|A_K^{-1/2}\left(X^0(s^n(t))\right)u_K(s^n(t))\right\|^2_{A(\bar X^{n,u,K}(s^n(t)))} dt\right]
= \frac{1}{2}\int_0^1 \left\|A_K^{-1/2}\left(X^0(t)\right)u_K(t)\right\|^2_{A(X^0(t))} dt.
\]

Combining this with (5.7) shows (5.6).

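To prove (5.5) we will show that in fact

\[
\hat\eta^{u,K}(dy \otimes dt) = \delta_{A(X^0(t))A_K^{-1/2}(X^0(t))u_K(t)}(dy)\, dt.
\]

For all $\sigma > 0$ let

\[
G_\sigma = \left\{(z, t) \in \mathbb{R}^d \times [0,1] : \left\|z - A(X^0(t))A_K^{-1/2}(X^0(t))u_K(t)\right\| \le \sigma\right\},
\]

and note that by weak convergence $\limsup_{n \to \infty} E[\hat\eta^{n,u,K}(G_\sigma)] \le E[\hat\eta^{u,K}(G_\sigma)]$. Note also that

\[
E\left[\hat\eta^{n,u,K}(G_\sigma)\right]
\ge P\left\{\sup_{t \in [0,1]}\left\|a(n)\sqrt{n}\int_{\mathbb{R}^d} y\, \eta^{n,u,K}_{\lfloor nt \rfloor}(dy) - A(X^0(t))A_K^{-1/2}(X^0(t))u_K(t)\right\| \le \sigma\right\}.
\]

However, by (5.4) we can choose $n$ large enough to make

\[
\sup_{t \in [0,1]}\left\|a(n)\sqrt{n}\int_{\mathbb{R}^d} y\, \eta^{n,u,K}_{\lfloor nt \rfloor}(dy) - A\left(\bar X^{n,u,K}(s^n(t))\right)A_K^{-1/2}\left(X^0(s^n(t))\right)u_K(s^n(t))\right\|
\]

arbitrarily small, and the proof that

\[
\sup_{t \in [0,1]}\left\|A\left(\bar X^{n,u,K}(s^n(t))\right)A_K^{-1/2}\left(X^0(s^n(t))\right)u_K(s^n(t)) - A(X^0(t))A_K^{-1/2}(X^0(t))u_K(t)\right\| \to 0
\]

in probability is identical to the proof of (5.9). Therefore $\limsup_{n \to \infty} E[\hat\eta^{n,u,K}(G_\sigma)] = 1$ for all $\sigma > 0$, and so $E[\hat\eta^{u,K}(\cap_{n \in \mathbb{N}} G_{1/n})] = 1$. This implies that with probability 1

\[
\hat\eta_{1|2}^{u,K}(dy|t) = \delta_{A(X^0(t))A_K^{-1/2}(X^0(t))u_K(t)}(dy)
\]

for a.e. $t$. It follows that

\[
\bar Y^{u,K}(t) = \int_0^t Db(X^0(s))\, \bar Y^{u,K}(s)\, ds + \int_0^t A(X^0(s))A_K^{-1/2}(X^0(s))u_K(s)\, ds,
\]

and therefore $\bar Y^{n,u,K} \to \phi^{u,K}$ weakly. This implies (5.5) and completes the proof. ■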

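The second theorem in this section allows us to approximate $F(\phi^{A^{1/2}(X^0)u})$ by $F(\phi^{u,K})$ and $\frac{1}{2}\int_0^1 \|u(s)\|^2\, ds$ by

\[
\frac{1}{2}\int_0^1 \left\|A_K^{-1/2}(X^0(s))u_K(s)\right\|^2_{A(X^0(s))} ds.
\]

Theorem 5.2 Let $u \in C([0,1] : \mathbb{R}^d)$ and define $\phi^{A^{1/2}(X^0)u}$ by (2.8) and $\phi^{u,K}$ by (5.3). Then as $K \to \infty$

\[
\phi^{u,K} \to \phi^{A^{1/2}(X^0)u}
\]

in $C([0,1] : \mathbb{R}^d)$ and

\[
\sup_{K \in (0,\infty)} \frac{1}{2}\int_0^1 \left\|A_K^{-1/2}(X^0(s))u_K(s)\right\|^2_{A(X^0(s))} ds \le \frac{1}{2}\int_0^1 \|u(s)\|^2\, ds.
\]

Proof. Note that

\[
\left\|A^{1/2}(X^0(s))A_K^{-1/2}(X^0(s))u_K(s)\right\| \le \|u(s)\|
\]

for all $s \in [0,1]$ and $K \in (0,\infty)$ so

\[
\sup_{K \in (0,\infty)} \frac{1}{2}\int_0^1 \left\|A_K^{-1/2}(X^0(s))u_K(s)\right\|^2_{A(X^0(s))} ds \le \frac{1}{2}\int_0^1 \|u(s)\|^2\, ds.
\]

In addition,

\[
\left\|A^{1/2}(X^0(s))A^{1/2}(X^0(s))A_K^{-1/2}(X^0(s))u_K(s) - A^{1/2}(X^0(s))u(s)\right\| \to 0
\]

and

\[
\left\|A^{1/2}(X^0(s))A^{1/2}(X^0(s))A_K^{-1/2}(X^0(s))u_K(s)\right\| \le \left\|A^{1/2}(X^0(s))u(s)\right\|
\]

for all $s \in [0,1]$ so dominated convergence gives

\[
A^{1/2}(X^0)A^{1/2}(X^0)A_K^{-1/2}(X^0)u_K \to A^{1/2}(X^0)u
\]

in $L^1([0,1] : \mathbb{R}^d)$. Combining this with the second line of (5.1) shows that

\[
\phi^{u,K} \to \phi^{A^{1/2}(X^0)u}
\]

in $C([0,1] : \mathbb{R}^d)$. ■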

Using (2.9) and the fact that any given control is suboptimal,

\[
-a(n)^2 \log E\left[e^{-\frac{1}{a(n)^2} F(Y^n)}\right]
\le E\left[\sum_{i=0}^{n-1} a(n)^2 R\left(\eta_i^{n,u,K} \,\|\, \mu_{\bar X_i^{n,u,K}}\right) + F\left(\bar Y^{n,u,K}\right)\right].
\]

Using Theorem 5.1, this implies

\[
\limsup_{n \to \infty} -a(n)^2 \log E\left[e^{-\frac{1}{a(n)^2} F(Y^n)}\right]
\le \frac{1}{2}\int_0^1 \left\|A_K^{-1/2}(X^0(s))u_K(s)\right\|^2_{A(X^0(s))} ds + F\left(\phi^{u,K}\right).
\]

Sending $K \to \infty$ and using Theorem 5.2 gives (5.2), and hence completes the proof of the lower bound (2.13).